[jira] [Commented] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy

2016-03-16 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15197056#comment-15197056
 ] 

Lefty Leverenz commented on HIVE-11675:
---

Doc note:  This adds configuration parameter 
*hive.orc.splits.ms.footer.cache.ppd.enabled* to HiveConf.java, so it needs to 
be documented in the ORC section of Configuration Properties for release 2.1.0.

* [Configuration Properties -- ORC File Format | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-ORCFileFormat]

This also fixes a typo in the description of *hive.orc.splits.include.fileid*, 
which was added to the llap branch by HIVE-10067 and to master for release 
2.0.0 by HIVE-11542.

> make use of file footer PPD API in ETL strategy or separate strategy
> 
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC2.1
> Fix For: 2.1.0
>
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch, 
> HIVE-11675.03.patch, HIVE-11675.04.patch, HIVE-11675.05.patch, 
> HIVE-11675.06.patch, HIVE-11675.07.patch, HIVE-11675.08.patch, 
> HIVE-11675.09.patch, HIVE-11675.10.patch, HIVE-11675.11.patch, 
> HIVE-11675.12.patch, HIVE-11675.13.patch, HIVE-11675.14.patch, 
> HIVE-11675.patch, HIVE-11675.premature.opti.patch
>
>
> Need to take a look at the best flow. It won't be much different if we do 
> filtering metastore call for each partition. So perhaps we'd need the custom 
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be 
> pushed down to metastore or fetched from local cache, that way the only slow 
> threaded op is directory listings



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy

2016-03-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196423#comment-15196423
 ] 

Hive QA commented on HIVE-11675:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12793363/HIVE-11675.14.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9827 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more 
- did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not 
produce a TEST-*.xml file
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7275/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7275/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7275/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12793363 - PreCommit-HIVE-TRUNK-Build

> make use of file footer PPD API in ETL strategy or separate strategy
> 
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch, 
> HIVE-11675.03.patch, HIVE-11675.04.patch, HIVE-11675.05.patch, 
> HIVE-11675.06.patch, HIVE-11675.07.patch, HIVE-11675.08.patch, 
> HIVE-11675.09.patch, HIVE-11675.10.patch, HIVE-11675.11.patch, 
> HIVE-11675.12.patch, HIVE-11675.13.patch, HIVE-11675.14.patch, 
> HIVE-11675.patch, HIVE-11675.premature.opti.patch
>
>
> Need to take a look at the best flow. It won't be much different if we do 
> filtering metastore call for each partition. So perhaps we'd need the custom 
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be 
> pushed down to metastore or fetched from local cache, that way the only slow 
> threaded op is directory listings



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy

2016-03-11 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15190791#comment-15190791
 ] 

Hive QA commented on HIVE-11675:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12792168/HIVE-11675.12.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 21 failed/errored test(s), 9806 tests 
executed
*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more 
- did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not 
produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge10
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge11
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_coalesce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_5
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_cast
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_udf
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_elt
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_nvl
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_10
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_11
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_6
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_div0
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_limit
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_math_funcs
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_string_funcs
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_timestamp_ints_casts
org.apache.hadoop.hive.ql.io.orc.TestOrcSplitElimination.testExternalFooterCachePpd
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7217/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7217/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7217/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 21 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12792168 - PreCommit-HIVE-TRUNK-Build

> make use of file footer PPD API in ETL strategy or separate strategy
> 
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch, 
> HIVE-11675.03.patch, HIVE-11675.04.patch, HIVE-11675.05.patch, 
> HIVE-11675.06.patch, HIVE-11675.07.patch, HIVE-11675.08.patch, 
> HIVE-11675.09.patch, HIVE-11675.10.patch, HIVE-11675.11.patch, 
> HIVE-11675.12.patch, HIVE-11675.patch, HIVE-11675.premature.opti.patch
>
>
> Need to take a look at the best flow. It won't be much different if we do 
> filtering metastore call for each partition. So perhaps we'd need the custom 
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be 
> pushed down to metastore or fetched from local cache, that way the only slow 
> threaded op is directory listings



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy

2016-03-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15185352#comment-15185352
 ] 

Hive QA commented on HIVE-11675:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12791834/HIVE-11675.11.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 29 failed/errored test(s), 9793 tests 
executed
*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more 
- did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not 
produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mergejoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge10
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge11
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge5
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge6
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge7
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_vectorization_ppd
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_coalesce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_5
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_cast
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_udf
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_elt
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_nvl
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_10
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_11
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_6
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_div0
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_limit
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_math_funcs
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_string_funcs
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_timestamp_ints_casts
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.ql.io.orc.TestOrcSplitElimination.testExternalFooterCachePpd
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7194/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7194/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7194/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 29 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12791834 - PreCommit-HIVE-TRUNK-Build

> make use of file footer PPD API in ETL strategy or separate strategy
> 
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch, 
> HIVE-11675.03.patch, HIVE-11675.04.patch, HIVE-11675.05.patch, 
> HIVE-11675.06.patch, HIVE-11675.07.patch, HIVE-11675.08.patch, 
> HIVE-11675.09.patch, HIVE-11675.10.patch, HIVE-11675.11.patch, 
> HIVE-11675.patch, HIVE-11675.premature.opti.patch
>
>
> Need to take a look at the best flow. It won't be much different if we do 
> filtering metastore call for each partition. So perhaps we'd need the custom 
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be 
> pushed down to metastore or fetched from local cache, that way the only slow 
> threaded op is directory listings



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy

2016-03-07 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15183798#comment-15183798
 ] 

Prasanth Jayachandran commented on HIVE-11675:
--

new changes look good to me, +1. Pending another test run

> make use of file footer PPD API in ETL strategy or separate strategy
> 
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch, 
> HIVE-11675.03.patch, HIVE-11675.04.patch, HIVE-11675.05.patch, 
> HIVE-11675.06.patch, HIVE-11675.07.patch, HIVE-11675.08.patch, 
> HIVE-11675.09.patch, HIVE-11675.10.patch, HIVE-11675.11.patch, 
> HIVE-11675.patch, HIVE-11675.premature.opti.patch
>
>
> Need to take a look at the best flow. It won't be much different if we do 
> filtering metastore call for each partition. So perhaps we'd need the custom 
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be 
> pushed down to metastore or fetched from local cache, that way the only slow 
> threaded op is directory listings



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy

2016-03-06 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182689#comment-15182689
 ] 

Prasanth Jayachandran commented on HIVE-11675:
--

Mostly looks good to me, +1. Some minor comments in RB. 

> make use of file footer PPD API in ETL strategy or separate strategy
> 
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch, 
> HIVE-11675.03.patch, HIVE-11675.04.patch, HIVE-11675.05.patch, 
> HIVE-11675.06.patch, HIVE-11675.07.patch, HIVE-11675.08.patch, 
> HIVE-11675.09.patch, HIVE-11675.patch
>
>
> Need to take a look at the best flow. It won't be much different if we do 
> filtering metastore call for each partition. So perhaps we'd need the custom 
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be 
> pushed down to metastore or fetched from local cache, that way the only slow 
> threaded op is directory listings



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy

2016-03-03 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178463#comment-15178463
 ] 

Sergey Shelukhin commented on HIVE-11675:
-

All the failures are known issues.

> make use of file footer PPD API in ETL strategy or separate strategy
> 
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch, 
> HIVE-11675.03.patch, HIVE-11675.04.patch, HIVE-11675.05.patch, 
> HIVE-11675.06.patch, HIVE-11675.07.patch, HIVE-11675.08.patch, 
> HIVE-11675.09.patch, HIVE-11675.patch
>
>
> Need to take a look at the best flow. It won't be much different if we do 
> filtering metastore call for each partition. So perhaps we'd need the custom 
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be 
> pushed down to metastore or fetched from local cache, that way the only slow 
> threaded op is directory listings



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy

2016-03-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15177879#comment-15177879
 ] 

Hive QA commented on HIVE-11675:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12790842/HIVE-11675.09.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9767 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more 
- did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7152/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7152/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7152/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12790842 - PreCommit-HIVE-TRUNK-Build

> make use of file footer PPD API in ETL strategy or separate strategy
> 
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch, 
> HIVE-11675.03.patch, HIVE-11675.04.patch, HIVE-11675.05.patch, 
> HIVE-11675.06.patch, HIVE-11675.07.patch, HIVE-11675.08.patch, 
> HIVE-11675.09.patch, HIVE-11675.patch
>
>
> Need to take a look at the best flow. It won't be much different if we do 
> filtering metastore call for each partition. So perhaps we'd need the custom 
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be 
> pushed down to metastore or fetched from local cache, that way the only slow 
> threaded op is directory listings



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy

2016-03-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174797#comment-15174797
 ] 

Hive QA commented on HIVE-11675:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12790631/HIVE-11675.08.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 9765 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more 
- did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.ql.io.orc.TestOrcSplitElimination.testExternalFooterCachePpd
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7139/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7139/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7139/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12790631 - PreCommit-HIVE-TRUNK-Build

> make use of file footer PPD API in ETL strategy or separate strategy
> 
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch, 
> HIVE-11675.03.patch, HIVE-11675.04.patch, HIVE-11675.05.patch, 
> HIVE-11675.06.patch, HIVE-11675.07.patch, HIVE-11675.08.patch, 
> HIVE-11675.patch
>
>
> Need to take a look at the best flow. It won't be much different if we do 
> filtering metastore call for each partition. So perhaps we'd need the custom 
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be 
> pushed down to metastore or fetched from local cache, that way the only slow 
> threaded op is directory listings



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy

2016-02-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170615#comment-15170615
 ] 

Hive QA commented on HIVE-11675:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12790211/HIVE-11675.07.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7110/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7110/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7110/

Messages:
{noformat}
 This message was trimmed, see log for full details 
[INFO] Building jar: 
/data/hive-ptest/working/apache-github-source-source/service-rpc/target/hive-service-rpc-2.1.0-SNAPSHOT.jar
[INFO] 
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ 
hive-service-rpc ---
[INFO] 
[INFO] --- maven-jar-plugin:2.2:test-jar (default) @ hive-service-rpc ---
[INFO] Building jar: 
/data/hive-ptest/working/apache-github-source-source/service-rpc/target/hive-service-rpc-2.1.0-SNAPSHOT-tests.jar
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ 
hive-service-rpc ---
[INFO] Installing 
/data/hive-ptest/working/apache-github-source-source/service-rpc/target/hive-service-rpc-2.1.0-SNAPSHOT.jar
 to 
/data/hive-ptest/working/maven/org/apache/hive/hive-service-rpc/2.1.0-SNAPSHOT/hive-service-rpc-2.1.0-SNAPSHOT.jar
[INFO] Installing 
/data/hive-ptest/working/apache-github-source-source/service-rpc/pom.xml to 
/data/hive-ptest/working/maven/org/apache/hive/hive-service-rpc/2.1.0-SNAPSHOT/hive-service-rpc-2.1.0-SNAPSHOT.pom
[INFO] Installing 
/data/hive-ptest/working/apache-github-source-source/service-rpc/target/hive-service-rpc-2.1.0-SNAPSHOT-tests.jar
 to 
/data/hive-ptest/working/maven/org/apache/hive/hive-service-rpc/2.1.0-SNAPSHOT/hive-service-rpc-2.1.0-SNAPSHOT-tests.jar
[INFO] 
[INFO] 
[INFO] Building Spark Remote Client 2.1.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ spark-client ---
[INFO] Deleting 
/data/hive-ptest/working/apache-github-source-source/spark-client/target
[INFO] Deleting 
/data/hive-ptest/working/apache-github-source-source/spark-client (includes = 
[datanucleus.log, derby.log], excludes = [])
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ 
spark-client ---
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ spark-client 
---
[INFO] 
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ 
spark-client ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 
/data/hive-ptest/working/apache-github-source-source/spark-client/src/main/resources
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ spark-client ---
[INFO] Executing tasks

main:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ spark-client 
---
[INFO] Compiling 28 source files to 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/classes
[WARNING] 
/data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/client/SparkClientUtilities.java:
 
/data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/client/SparkClientUtilities.java
 uses or overrides a deprecated API.
[WARNING] 
/data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/client/SparkClientUtilities.java:
 Recompile with -Xlint:deprecation for details.
[WARNING] 
/data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcDispatcher.java:
 Some input files use unchecked or unsafe operations.
[WARNING] 
/data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcDispatcher.java:
 Recompile with -Xlint:unchecked for details.
[INFO] 
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ 
spark-client ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ spark-client ---
[INFO] Executing tasks

main:
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp
[mkdir] Created dir: 

[jira] [Commented] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy

2016-02-26 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169715#comment-15169715
 ] 

Sergey Shelukhin commented on HIVE-11675:
-

[~prasanth_j] can you review? The new files are mostly moved/renamed code, with 
some small changes. Every time this patch sits for a while it gets a lot of 
conflicts.

> make use of file footer PPD API in ETL strategy or separate strategy
> 
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch, 
> HIVE-11675.03.patch, HIVE-11675.04.patch, HIVE-11675.05.patch, 
> HIVE-11675.06.patch, HIVE-11675.07.patch, HIVE-11675.patch
>
>
> Need to take a look at the best flow. It won't be much different if we do 
> filtering metastore call for each partition. So perhaps we'd need the custom 
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be 
> pushed down to metastore or fetched from local cache, that way the only slow 
> threaded op is directory listings



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy

2016-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15146252#comment-15146252
 ] 

Hive QA commented on HIVE-11675:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12787601/HIVE-11675.05.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 221 failed/errored test(s), 9729 tests 
executed
*Failed tests:*
{noformat}
TestPerfCliDriver - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cte_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cte_mat_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cte_mat_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cte_mat_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cte_mat_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cte_mat_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_coltype_literals
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_5
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_2
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_3
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_4
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_5
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_acid_globallimit
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_alter_merge_2_orc
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_alter_merge_orc
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_gby
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_gby_empty
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_join
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_limit
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_semijoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_simple_select
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_stats
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_subq_exists
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_subq_in
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_subq_not_in
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_udf_udaf
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_union
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_views
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_windowing
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cte_5
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cte_mat_4
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cte_mat_5
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_all_non_partitioned
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_all_partitioned
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_orig_table
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_tmp_table
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_where_no_match
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_where_non_partitioned
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_where_partitioned
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_whole_partition
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_4
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_hybridgrace_hashjoin_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_insert_orig_table
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_insert_update_delete
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_limit_pushdown
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_llap_nullscan
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_llapdecider
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mapjoin_decimal
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mergejoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge1

[jira] [Commented] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy

2016-02-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130017#comment-15130017
 ] 

Hive QA commented on HIVE-11675:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12785648/HIVE-11675.04.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10033 tests 
executed
*Failed tests:*
{noformat}
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.service.cli.TestEmbeddedThriftBinaryCLIService.testExecuteStatementAsync
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6848/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6848/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6848/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12785648 - PreCommit-HIVE-TRUNK-Build

> make use of file footer PPD API in ETL strategy or separate strategy
> 
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch, 
> HIVE-11675.03.patch, HIVE-11675.04.patch, HIVE-11675.patch
>
>
> Need to take a look at the best flow. It won't be much different if we do 
> filtering metastore call for each partition. So perhaps we'd need the custom 
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be 
> pushed down to metastore or fetched from local cache, that way the only slow 
> threaded op is directory listings



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy

2016-01-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15125198#comment-15125198
 ] 

Hive QA commented on HIVE-11675:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12785052/HIVE-11675.03.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6806/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6806/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6806/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-6806/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 208ab35 HIVE-12727 : refactor Hive strict checks to be more 
granular, allow order by no limit and no partition filter by default for now 
(Sergey Shelukhin, reviewed by Xuefu Zhang) ADDENDUM2
+ git clean -f -d
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at 208ab35 HIVE-12727 : refactor Hive strict checks to be more 
granular, allow order by no limit and no partition filter by default for now 
(Sergey Shelukhin, reviewed by Xuefu Zhang) ADDENDUM2
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12785052 - PreCommit-HIVE-TRUNK-Build

> make use of file footer PPD API in ETL strategy or separate strategy
> 
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch, 
> HIVE-11675.03.patch, HIVE-11675.patch
>
>
> Need to take a look at the best flow. It won't be much different if we do 
> filtering metastore call for each partition. So perhaps we'd need the custom 
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be 
> pushed down to metastore or fetched from local cache, that way the only slow 
> threaded op is directory listings



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy

2015-11-24 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025704#comment-15025704
 ] 

Prasanth Jayachandran commented on HIVE-11675:
--

Left some comments in RB

> make use of file footer PPD API in ETL strategy or separate strategy
> 
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch, 
> HIVE-11675.patch
>
>
> Need to take a look at the best flow. It won't be much different if we do 
> filtering metastore call for each partition. So perhaps we'd need the custom 
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be 
> pushed down to metastore or fetched from local cache, that way the only slow 
> threaded op is directory listings



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy

2015-11-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013476#comment-15013476
 ] 

Hive QA commented on HIVE-11675:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12772876/HIVE-11675.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9865 tests executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6074/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6074/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6074/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12772876 - PreCommit-HIVE-TRUNK-Build

> make use of file footer PPD API in ETL strategy or separate strategy
> 
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch, 
> HIVE-11675.patch
>
>
> Need to take a look at the best flow. It won't be much different if we do 
> filtering metastore call for each partition. So perhaps we'd need the custom 
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be 
> pushed down to metastore or fetched from local cache, that way the only slow 
> threaded op is directory listings



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy

2015-10-06 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14946092#comment-14946092
 ] 

Sergey Shelukhin commented on HIVE-11675:
-

[~gopalv] [~prasanth_j] fyi

> make use of file footer PPD API in ETL strategy or separate strategy
> 
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.patch
>
>
> Need to take a look at the best flow. It won't be much different if we do 
> filtering metastore call for each partition. So perhaps we'd need the custom 
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be 
> pushed down to metastore or fetched from local cache, that way the only slow 
> threaded op is directory listings
> NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy

2015-10-06 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14946091#comment-14946091
 ] 

Sergey Shelukhin commented on HIVE-11675:
-

Here's a final, but untested, patch. It applies on top of many uncommitted 
patches and some committed patches that are now impossible to merge thru the 
web of local branches to the base branch of this branch. I will unfcuk it all 
after other patches are committed.

> make use of file footer PPD API in ETL strategy or separate strategy
> 
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.patch
>
>
> Need to take a look at the best flow. It won't be much different if we do 
> filtering metastore call for each partition. So perhaps we'd need the custom 
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be 
> pushed down to metastore or fetched from local cache, that way the only slow 
> threaded op is directory listings
> NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy

2015-10-01 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940719#comment-14940719
 ] 

Sergey Shelukhin commented on HIVE-11675:
-

I have patch long in the works but I keep getting distracted by other random 
crap.


> make use of file footer PPD API in ETL strategy or separate strategy
> 
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> Need to take a look at the best flow. It won't be much different if we do 
> filtering metastore call for each partition. So perhaps we'd need the custom 
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be 
> pushed down to metastore or fetched from local cache, that way the only slow 
> threaded op is directory listings



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy

2015-08-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14717846#comment-14717846
 ] 

Sergey Shelukhin commented on HIVE-11675:
-

[~gopalv] fyi

 make use of file footer PPD API in ETL strategy or separate strategy
 

 Key: HIVE-11675
 URL: https://issues.apache.org/jira/browse/HIVE-11675
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin

 Need to take a look at the best flow. It won't be much different if we do 
 filtering metastore call for each partition. So perhaps we'd need the custom 
 sync point/batching after all.
 Or we can make it opportunistic and not fetch any footers unless it can be 
 pushed down to metastore or fetched from local cache, that way the only slow 
 threaded op is directory listings



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)