Hi Tim,

Thanks for the update.

Regards,
Valencia



From:   Tim Armstrong <[email protected]>
To:     Valencia Serrao/Austin/Contr/IBM@IBMUS
Cc:     Alex Behm <[email protected]>, Casey Ching
            <[email protected]>, [email protected], Manish
            Patil/Austin/Contr/IBM@IBMUS, Nishidha
            Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
            Jagadale/Austin/Contr/IBM@IBMUS
Date:   07/20/2016 02:35 AM
Subject:        Re: Fw: Issues with generating testdata for Impala



Hi Valencia,
  I wasn't able to get a clear answer, but as far as we know it hasn't been
modified.

- Tim

On Tue, Jul 12, 2016 at 4:59 AM, Valencia Serrao <[email protected]>
wrote:
  Hi Tim,

  Thank you for responding.

  Please do let me know if any post-processing was done on the data at
  https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp .

  Regards,
  Valencia



  From: Tim Armstrong <[email protected]>
  To: Valencia Serrao/Austin/Contr/IBM@IBMUS
  Cc: Casey Ching <[email protected]>, Alex Behm <[email protected]>,
  [email protected], Nishidha
  Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
  Jagadale/Austin/Contr/IBM@IBMUS, Manish Patil/Austin/Contr/IBM@IBMUS
  Date: 07/08/2016 01:31 AM
  Subject: Re: Fw: Issues with generating testdata for Impala



  Hi Valencia,
    The data is scale factor 1 for the TPC-H and TPC-DS benchmarks:
  http://www.tpc.org/tpc_documents_current_versions/current_specifications.asp


  I imagine you could reconstruct it using their data generators.
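
A minimal sketch of driving the TPC-H generator at scale factor 1, assuming `dbgen` has already been built from the TPC-H tools and lives in a hypothetical directory such as `/opt/tpch/dbgen`. The `-s` (scale factor) and `-f` (overwrite existing output) options are standard `dbgen` flags; the helper names and the directory are illustrative only, and nothing in this thread confirms that the output would match the published tarballs.

```python
import subprocess
from pathlib import Path


def dbgen_command(dbgen_dir, scale_factor=1):
    """Build the argv for TPC-H dbgen at the given scale factor.

    dbgen_dir is a hypothetical location of a locally built dbgen binary.
    """
    dbgen = Path(dbgen_dir) / "dbgen"
    # -s sets the scale factor, -f overwrites existing .tbl files.
    return [str(dbgen), "-s", str(scale_factor), "-f"]


def generate_tpch(dbgen_dir, scale_factor=1):
    """Run dbgen from its own directory and list the .tbl files it wrote."""
    subprocess.run(dbgen_command(dbgen_dir, scale_factor), cwd=dbgen_dir, check=True)
    return sorted(Path(dbgen_dir).glob("*.tbl"))
```

Calling `generate_tpch("/opt/tpch/dbgen")` would leave one `.tbl` file per table in that directory; how those files need to be arranged under `testdata/impala-data/tpch` is not covered by this sketch.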

  I'm unsure if we modified those data generators at all or did any
  postprocessing. I'm going to check if anyone knows exactly how that data
  was generated originally.

  On Wed, Jul 6, 2016 at 10:52 PM, Valencia Serrao <[email protected]>
  wrote:
        Hi Casey/Alex/Tim,

        I need to know whether it is possible to generate the tpch and
        tpcds data without using the tar files you provided at
        https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp .
        When I tried to load data without the tpch and tpcds tar files,
        the functional-query data loaded successfully, but I got the
        following error during the TPC-H data load step:

        Error: Error while compiling statement: FAILED: SemanticException
        Line 1:23 Invalid path
        ''/ImpalaPPC/testdata/impala-data/tpch/lineitem'': No files
        matching path file: /ImpalaPPC/testdata/impala-data/tpch/lineitem
        (state=42000,code=40000)
        org.apache.hive.service.cli.HiveSQLException: Error while compiling
        statement: FAILED: SemanticException Line 1:23 Invalid path
        ''/ImpalaPPC/testdata/impala-data/tpch/lineitem'': No files
        matching path file:/ImpalaPPC/testdata/impala-data/tpch/lineitem
        at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:235)
        at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:221)
        at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:244)
        at org.apache.hive.beeline.Commands.executeInternal(Commands.java:893)
        at org.apache.hive.beeline.Commands.execute(Commands.java:1079)
        at org.apache.hive.beeline.Commands.sql(Commands.java:976)
        at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1085)
        at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:917)
        at org.apache.hive.beeline.BeeLine.executeFile(BeeLine.java:895)
        at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:837)
        at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:482)
        at org.apache.hive.beeline.BeeLine.main(BeeLine.java:465)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
        Caused by: org.apache.hive.service.cli.HiveSQLException: Error
        while compiling statement: FAILED: SemanticException Line 1:23
        Invalid path ''/ImpalaPPC/testdata/impala-data/tpch/lineitem'': No
        files matching path
        file:/ImpalaPPC/testdata/impala-data/tpch/lineitem
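
The "No files matching path" message suggests the load step found nothing under the local `tpch/lineitem` directory. A minimal pre-flight check along those lines, assuming one subdirectory per TPC-H table under the data root (a layout inferred only from the path in the error above); the function name is hypothetical:

```python
from pathlib import Path

# The eight tables defined by the TPC-H schema.
TPCH_TABLES = [
    "customer", "lineitem", "nation", "orders",
    "part", "partsupp", "region", "supplier",
]


def missing_table_dirs(data_root, tables=TPCH_TABLES):
    """Return the tables whose directory is absent or contains no files."""
    root = Path(data_root)
    missing = []
    for table in tables:
        table_dir = root / table
        has_files = table_dir.is_dir() and any(
            p.is_file() for p in table_dir.iterdir()
        )
        if not has_files:
            missing.append(table)
    return missing
```

Running `missing_table_dirs("/ImpalaPPC/testdata/impala-data/tpch")` before the load would list every table directory that is missing or empty, rather than failing on the first one.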


        Regards,
        Valencia


        From: Casey Ching <[email protected]>
        To: Alex Behm <[email protected]>,
        [email protected]
        Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
        Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
        Serrao/Austin/Contr/IBM@IBMUS
        Date: 05/04/2016 11:51 AM
        Subject: Re: Fw: Issues with generating testdata for Impala




        Comment inline below


        On May 3, 2016 at 11:18:06 PM, Alex Behm ([email protected])
        wrote:


                                Hi Valencia,

                                I'm sorry you are having so much trouble
                                with our setup. Let's see what we
                                can do.

                                There was an infra issue with receiving the
                                logs you sent me. The
                                email/attachment got rejected on our side.
                                Maybe you can upload the logs
                                somewhere so I can grab them?

                                See more responses inline below.

                                On Sat, Apr 30, 2016 at 5:01 AM, Valencia
                                Serrao <[email protected]> wrote:

                                > Hi Alex,
                                >
                                > I was going deeper through the logs. I have some findings and queries:
                                >
                                > 1. At the "Invalidating Metadata" step (as mentioned in the mail below), I
                                > noticed that it is trying to use Kerberos. Perhaps this is preventing the
                                > testdata generation from proceeding, as we are not using Kerberos.
                                > I need to know how this can be done without involving Kerberos support.
                                >
                                Kerberos is certainly not needed to build
                                and run tests.

                                >
                                > 2. I executed the FE tests despite the incomplete testdata generation;
                                > the tests started and, as expected, failed. Many of the failures (null pointer
                                > exceptions in AuthorizationTests) have a common cause: "tpch database does
                                > not exist."
                                > e.g. as shown in .Impala/cluster_logs/query_tests/test-run-workload.log.
                                >
                                > Does the "tpch" database get created after the current blocker step
                                > "Invalidating Metadata"?
                                >

                                Yes, the TPCH database is created and
                                loaded as part of that first phase.
                                However, the data files are not yet
                                publicly accessible. Let me work on
                                that from my side, and get back to you
                                soon. One way or the other we'll be
                                able to provide you with the data.

        The data is at
        
https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp
         . The files are split into 50 MB pieces for git. You can put them
        back together as is done in
        
https://github.com/cloudera/Impala-docker-hub/blob/master/complete/Dockerfile
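
A minimal sketch of putting split pieces back together by concatenating them in name order. This does not reproduce the Dockerfile's actual commands, and it assumes the pieces share a common prefix and sort lexicographically in the right order (as `split`-style suffixes do); the function and file names are hypothetical.

```python
import shutil
from pathlib import Path


def join_parts(parts_dir, prefix, output_path):
    """Concatenate every file in parts_dir whose name starts with prefix.

    Pieces are appended in lexicographic name order. Returns the piece
    names in the order they were written.
    """
    output = Path(output_path).resolve()
    parts = sorted(
        p for p in Path(parts_dir).iterdir()
        if p.is_file() and p.name.startswith(prefix) and p.resolve() != output
    )
    if not parts:
        raise FileNotFoundError(f"no pieces starting with {prefix!r} in {parts_dir}")
    with open(output, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream each piece so large files are never held in memory.
                shutil.copyfileobj(src, out)
    return [p.name for p in parts]
```

The joined file can then be unpacked with the usual tar tooling. The output path is excluded from the piece list, so rerunning the function does not append the archive to itself.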

                                >
                                > 3. In the FE test console output log, another error is shown:
                                > ============================= test session starts ==============================
                                > platform linux2 -- Python 2.7.5 -- py-1.4.30 -- pytest-2.7.2
                                > rootdir: /work/, inifile:
                                > plugins: random, xdist
                                > ERROR: file not found:/work/Impala/../Impala-auxiliary-tests/tests/aux_custom_cluster_tests/
                                >
                                > These are not present/created on my VM. May I know when these get created?
                                >
                                > 4. Could you also share the total number of FE tests?
                                >

                                I'll privately send you the console output
                                from a successful FE run.
                                Hopefully that can help.

                                Cheers,

                                Alex

                                >
                                >
                                > Looking forward to your reply.
                                >
                                > Regards,
                                > Valencia
                                >
                                >
                                >
                                > From: Valencia Serrao/Austin/Contr/IBM
                                > To: [email protected], Alex
                                Behm <[email protected]>
                                > Cc: Sudarshan
                                Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
                                > Panpaliya/Austin/Contr/IBM@IBMUS,
                                Valencia Serrao/Austin/Contr/IBM@IBMUS
                                > Date: 04/30/2016 09:05 AM
                                > Subject: Fw: Issues with generating
                                testdata for Impala
                                > ------------------------------
                                >
                                >
                                >
                                > Hi Alex,
                                >
                                > I've been able to make some progress on testdata generation; however, I
                                > still face the following issues:
                                >
                                >
                                >
                                
*******************************************************************************************************************************************************************

                                > Invalidating Metadata
                                >
                                > (load-functional-query-exhaustive-impala-load-generated-parquet-none-none.sql):
                                > INSERT OVERWRITE TABLE functional_parquet.alltypes partition (year, month)
                                > SELECT id, bool_col, tinyint_col, smallint_col, int_col, bigint_col,
                                > float_col, double_col, date_string_col, string_col, timestamp_col, year,
                                > month
                                > FROM functional.alltypes
                                >
                                > Data Loading from Impala failed with error: ImpalaBeeswaxException:
                                > INNER EXCEPTION: <class 'socket.error'>
                                > MESSAGE: [Errno 104] Connection reset by peer
                                > Error in /root/nishidha/Impala/testdata/bin/create-load-data.sh at line 41: while [ -n "$*" ]
                                > Error in /root/nishidha/Impala/buildall.sh at line 368:
                                > ${IMPALA_HOME}/testdata/bin/create-load-data.sh ${CREATE_LOAD_DATA_ARGS} <<< Y
                                >
                                >
                                
*************************************************************************************************************************************************************************

                                >
                                > I continued with the FE tests as is. Here is the complete output log.
                                > [attachment "fe_test_output.zip" deleted by Valencia Serrao/Austin/Contr/IBM]
                                >
                                > Cluster logs: [attachment "cluster_logs.7z" deleted by Valencia Serrao/Austin/Contr/IBM]
                                >
                                > Kindly guide me on the same.
                                >
                                > Regards,
                                > Valencia
                                > ----- Forwarded by Valencia
                                Serrao/Austin/Contr/IBM on 04/29/2016 10:57
                                AM
                                > -----
                                >
                                > From: Sudarshan Jagadale/Austin/Contr/IBM

                                > To: Valencia
                                Serrao/Austin/Contr/IBM@IBMUS
                                > Date: 04/29/2016 10:49 AM
                                > Subject: Fw: Issues with generating
                                testdata for Impala
                                > ------------------------------
                                >
                                >
                                > FYI
                                > Thanks and Regards
                                > Sudarshan Jagadale
                                > Power Open Source Solutions
                                > ----- Forwarded by Sudarshan
                                Jagadale/Austin/Contr/IBM on 04/29/2016
                                10:48
                                > AM -----
                                >
                                > From: Alex Behm <[email protected]>
                                > To: [email protected]
                                > Cc: Sudarshan
                                Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
                                > Panpaliya/Austin/Contr/IBM@IBMUS
                                > Date: 04/28/2016 09:34 PM
                                > Subject: Re: Issues with generating
                                testdata for Impala
                                > ------------------------------
                                >
                                >
                                >
                                > Hi Valencia,
                                >
                                > Sorry, I did not get the attachment. Would you be able to tar.gz and attach
                                > the whole cluster_logs directory?
                                >
                                > Alex
                                >
                                > On Thu, Apr 28, 2016 at 6:23 AM, Valencia Serrao <[email protected]> wrote:
                                >
                                > Hi Alex,
                                >
                                > I tried building Impala again with the following:
                                > HDFS CDH 5.7.0 (http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3)
                                > HBASE CDH 5.7.0 SNAPSHOT (http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz)
                                > - this required patching in a fix (https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch)
                                > HIVE CDH 5.8.0 SNAPSHOT
                                >
                                > With the above combination, I'm able to move past the exception and
                                > also have the RegionServer service up and running. However, it now gives
                                > the error below:
                                >
                                >
                                >
                                
********************************************************************************************************************

                                > (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
                                > CREATE EXTERNAL TABLE IF NOT EXISTS functional.decimal_tbl (
                                > d1 DECIMAL,
                                > d2 DECIMAL(10, 0),
                                > d3 DECIMAL(20, 10),
                                > d4 DECIMAL(38, 38),
                                > d5 DECIMAL(10, 5))
                                > PARTITIONED BY (d6 DECIMAL(9, 0))
                                > ROW FORMAT delimited fields terminated by ','
                                > STORED AS TEXTFILE
                                > LOCATION '/test-warehouse/decimal_tbl'
                                >
                                > (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
                                > USE functional
                                >
                                > (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
                                > ALTER TABLE decimal_tbl ADD IF NOT EXISTS PARTITION(d6=1)
                                >
                                > Data Loading from Impala failed with error: ImpalaBeeswaxException:
                                > INNER EXCEPTION: <class 'impala._thrift_gen.beeswax.ttypes.BeeswaxException'>
                                > MESSAGE:
                                > Error: null
                                >
                                
******************************************************************************************************************

                                >
                                > Here is the complete log for the same. (See attached file:
                                > data-load-functional-exhaustive.log)
                                >
                                > It would be great if you could guide me on this issue, so I can proceed
                                > with the FE tests.
                                >
                                > I am still awaiting a link to the source code of HDFS CDH 5.8.0.
                                >
                                > Regards,
                                > Valencia