[jira] [Commented] (HIVE-15956) StackOverflowError when drop lots of partitions
[ https://issues.apache.org/jira/browse/HIVE-15956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906444#comment-15906444 ] Zoltan Haindrich commented on HIVE-15956: - [~niklaus.xiao] I've tried your patch...and I've still seen the problem after applying the fix. Did it work for you? - it might be possible that I've screwed something up - but after a clean rebuild it still failed > StackOverflowError when drop lots of partitions > --- > > Key: HIVE-15956 > URL: https://issues.apache.org/jira/browse/HIVE-15956 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.3.0, 2.2.0 >Reporter: Niklaus Xiao >Assignee: Niklaus Xiao > Attachments: HIVE-15956.patch > > > Repro steps: > 1. Create partitioned table and add 9000 partitions > {code} > create table test_partition(id int) partitioned by (dt int); > alter table test_partition add partition(dt=1); > alter table test_partition add partition(dt=3); > alter table test_partition add partition(dt=4); > ... > alter table test_partition add partition(dt=9000); > {code} > 2. 
Drop 9000 partitions: > {code} > alter table test_partition drop partition(dt<9000); > {code} > Step 2 will fail with StackOverflowError: > {code} > Exception in thread "pool-7-thread-161" java.lang.StackOverflowError > at > org.datanucleus.query.expression.ExpressionCompiler.isOperator(ExpressionCompiler.java:819) > at > org.datanucleus.query.expression.ExpressionCompiler.compileOrAndExpression(ExpressionCompiler.java:190) > at > org.datanucleus.query.expression.ExpressionCompiler.compileExpression(ExpressionCompiler.java:179) > at > org.datanucleus.query.expression.ExpressionCompiler.compileOrAndExpression(ExpressionCompiler.java:192) > at > org.datanucleus.query.expression.ExpressionCompiler.compileExpression(ExpressionCompiler.java:179) > at > org.datanucleus.query.expression.ExpressionCompiler.compileOrAndExpression(ExpressionCompiler.java:192) > at > org.datanucleus.query.expression.ExpressionCompiler.compileExpression(ExpressionCompiler.java:179) > {code} > {code} > Exception in thread "pool-7-thread-198" java.lang.StackOverflowError > at > org.datanucleus.query.expression.DyadicExpression.bind(DyadicExpression.java:83) > at > org.datanucleus.query.expression.DyadicExpression.bind(DyadicExpression.java:87) > at > org.datanucleus.query.expression.DyadicExpression.bind(DyadicExpression.java:87) > at > org.datanucleus.query.expression.DyadicExpression.bind(DyadicExpression.java:87) > at > org.datanucleus.query.expression.DyadicExpression.bind(DyadicExpression.java:87) > at > org.datanucleus.query.expression.DyadicExpression.bind(DyadicExpression.java:87) > at > org.datanucleus.query.expression.DyadicExpression.bind(DyadicExpression.java:87) > at > org.datanucleus.query.expression.DyadicExpression.bind(DyadicExpression.java:87) > at > org.datanucleus.query.expression.DyadicExpression.bind(DyadicExpression.java:87) > at > org.datanucleus.query.expression.DyadicExpression.bind(DyadicExpression.java:87) > {code} -- This message was sent by Atlassian JIRA 
(v6.3.15#6346)
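The overflow above appears to come from DataNucleus recursively compiling the filter that a `dt<9000` drop expands into: a left-deep chain of thousands of OR'd partition predicates, so recursion depth grows with partition count (visible in the `compileOrAndExpression`/`DyadicExpression.bind` frames). Independent of the attached patch, a common operational workaround is to drop in bounded batches so no single metastore call compiles a huge expression. A minimal sketch of generating such batched DDL, with a hypothetical helper name and assuming the usual `ALTER TABLE ... DROP IF EXISTS PARTITION` syntax:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchedPartitionDrop {
    // Hypothetical workaround sketch (not the attached patch): emit several
    // bounded drops instead of one "dt<9000", so the metastore never has to
    // compile a filter that expands into thousands of OR'd terms.
    static List<String> batchedDrops(String table, int maxDt, int batch) {
        List<String> ddl = new ArrayList<>();
        for (int hi = batch; hi <= maxDt; hi += batch) {
            // Each statement only drops the partitions earlier batches left behind.
            ddl.add(String.format(
                "alter table %s drop if exists partition (dt < %d)", table, hi));
        }
        return ddl;
    }

    public static void main(String[] args) {
        List<String> ddl = batchedDrops("test_partition", 9000, 1000);
        System.out.println(ddl.size());
        System.out.println(ddl.get(0));
        System.out.println(ddl.get(ddl.size() - 1));
    }
}
```

Each successive `dt < hi` predicate is cheap to compile, and the cumulative effect matches the single large drop.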
[jira] [Commented] (HIVE-15867) Add blobstore tests for import/export
[ https://issues.apache.org/jira/browse/HIVE-15867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906445#comment-15906445 ] Sahil Takiar commented on HIVE-15867: - +1 (non-binding, still need a committer to take a look) > Add blobstore tests for import/export > - > > Key: HIVE-15867 > URL: https://issues.apache.org/jira/browse/HIVE-15867 > Project: Hive > Issue Type: Bug >Reporter: Thomas Poepping >Assignee: Juan Rodríguez Hortalá > Attachments: HIVE-15867.patch > > > This patch covers ten separate tests testing import and export operations > running against blobstore filesystems: > * Import addpartition > ** blobstore -> file > ** file -> blobstore > ** blobstore -> blobstore > ** blobstore -> hdfs > * import/export > ** blobstore -> file > ** file -> blobstore > ** blobstore -> blobstore (partitioned and non-partitioned) > ** blobstore -> HDFS (partitioned and non-partitioned) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-11019) Can't create an Avro table with uniontype column correctly
[ https://issues.apache.org/jira/browse/HIVE-11019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906418#comment-15906418 ] Nikita Goyal commented on HIVE-11019: - I have been facing the same issue. Can anyone take a look? > Can't create an Avro table with uniontype column correctly > -- > > Key: HIVE-11019 > URL: https://issues.apache.org/jira/browse/HIVE-11019 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.0 >Reporter: Bing Li >Assignee: Bing Li > > I tried the example in > https://cwiki.apache.org/confluence/display/Hive/AvroSerDe > And found that it can't create an AVRO table correctly with uniontype > hive> create table avro_union(union1 uniontype) STORED > AS AVRO; > OK > Time taken: 0.083 seconds > hive> describe avro_union; > OK > union1 uniontype > > Time taken: 0.058 seconds, Fetched: 1 row(s) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15665) LLAP: OrcFileMetadata objects in cache can impact heap usage
[ https://issues.apache.org/jira/browse/HIVE-15665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906367#comment-15906367 ] Sergey Shelukhin commented on HIVE-15665: - Indexes are going to be read like regular streams... I was able to work on this a little bit more, I now have unfinished code for all 3 metadata cache constituents :) > LLAP: OrcFileMetadata objects in cache can impact heap usage > > > Key: HIVE-15665 > URL: https://issues.apache.org/jira/browse/HIVE-15665 > Project: Hive > Issue Type: Improvement > Components: llap >Reporter: Rajesh Balamohan >Assignee: Sergey Shelukhin > Attachments: HIVE-15665.WIP.patch > > > OrcFileMetadata internally has filestats, stripestats etc which are allocated > in heap. On large data sets, this could have an impact on the heap usage and > the memory usage by different executors in LLAP. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16133) Footer cache in Tez AM can take too much memory
[ https://issues.apache.org/jira/browse/HIVE-16133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-16133: Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Committed to master. Thanks for the reviews! > Footer cache in Tez AM can take too much memory > --- > > Key: HIVE-16133 > URL: https://issues.apache.org/jira/browse/HIVE-16133 > Project: Hive > Issue Type: Bug >Reporter: Siddharth Seth >Assignee: Sergey Shelukhin > Fix For: 2.2.0 > > Attachments: HIVE-16133.01.patch, HIVE-16133.02.patch, > HIVE-16133.02.patch, HIVE-16133.03.patch, HIVE-16133.04.patch, > HIVE-16133.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16182) Semijoin: Avoid VectorHashKeyWrapper allocations for the bloom hash aggregate
[ https://issues.apache.org/jira/browse/HIVE-16182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906363#comment-15906363 ] Sergey Shelukhin commented on HIVE-16182: - +1 > Semijoin: Avoid VectorHashKeyWrapper allocations for the bloom hash aggregate > - > > Key: HIVE-16182 > URL: https://issues.apache.org/jira/browse/HIVE-16182 > Project: Hive > Issue Type: Improvement > Components: Vectorization >Affects Versions: 2.2.0 >Reporter: Gopal V >Assignee: Gopal V > Labels: performance > Attachments: HIVE-16182.1.patch > > > To avoid GC spam during the hash aggregate part of the bloom filter, the key > for the semijoin can be special-cased as an immutable empty key. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
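The description suggests replacing per-row key-wrapper allocations with one shared immutable key, since the bloom-filter hash aggregate groups every row into the same bucket anyway. A rough sketch of that flyweight idea, with hypothetical names (Hive's real class is `VectorHashKeyWrapper`, whose API differs):

```java
public class EmptyKeySketch {
    // Hypothetical stand-in for Hive's hash-key wrapper class.
    static final class HashKey {
        static final HashKey EMPTY = new HashKey();      // shared flyweight
        private HashKey() {}
        @Override public int hashCode() { return 0; }    // empty key: constant hash
        @Override public boolean equals(Object o) { return o instanceof HashKey; }
    }

    // Instead of "new HashKey()" per row (GC spam), hand out the singleton.
    static HashKey keyForRow() {
        return HashKey.EMPTY;
    }

    public static void main(String[] args) {
        // Every row maps to the identical object: zero per-row allocation.
        System.out.println(keyForRow() == keyForRow());
    }
}
```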
[jira] [Commented] (HIVE-15978) Support regr_* functions
[ https://issues.apache.org/jira/browse/HIVE-15978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906337#comment-15906337 ] Hive QA commented on HIVE-15978: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12857491/HIVE-15978.1.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 10420 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udaf_binarysetfunctions] (batchId=35) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4091/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4091/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4091/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12857491 - PreCommit-HIVE-Build > Support regr_* functions > > > Key: HIVE-15978 > URL: https://issues.apache.org/jira/browse/HIVE-15978 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Carter Shanklin >Assignee: Zoltan Haindrich > Attachments: HIVE-15978.1.patch > > > Support the standard regr_* functions, regr_slope, regr_intercept, regr_r2, > regr_sxx, regr_syy, regr_sxy, regr_avgx, regr_avgy, regr_count. SQL reference > section 10.9 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15978) Support regr_* functions
[ https://issues.apache.org/jira/browse/HIVE-15978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-15978: Status: Patch Available (was: Open) > Support regr_* functions > > > Key: HIVE-15978 > URL: https://issues.apache.org/jira/browse/HIVE-15978 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Carter Shanklin >Assignee: Zoltan Haindrich > Attachments: HIVE-15978.1.patch > > > Support the standard regr_* functions, regr_slope, regr_intercept, regr_r2, > regr_sxx, regr_syy, regr_sxy, regr_avgx, regr_avgy, regr_count. SQL reference > section 10.9 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15978) Support regr_* functions
[ https://issues.apache.org/jira/browse/HIVE-15978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-15978: Attachment: HIVE-15978.1.patch [~pxiong] now I see that I never had any chance to not declare these as aggregators :) patch #1) I've retrofitted some existing aggregators to service the regr_ methods. > Support regr_* functions > > > Key: HIVE-15978 > URL: https://issues.apache.org/jira/browse/HIVE-15978 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Carter Shanklin >Assignee: Zoltan Haindrich > Attachments: HIVE-15978.1.patch > > > Support the standard regr_* functions, regr_slope, regr_intercept, regr_r2, > regr_sxx, regr_syy, regr_sxy, regr_avgx, regr_avgy, regr_count. SQL reference > section 10.9 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
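For reference, the SQL-standard semantics being implemented here: `regr_count` is the number of pairs where both expressions are non-null, `regr_slope(y, x)` is `covar_pop(y, x) / var_pop(x)`, and `regr_intercept(y, x)` is `avg(y) - slope * avg(x)`. A standalone single-pass sketch of those formulas (an illustration of the math, not Hive's GenericUDAF code):

```java
public class RegrSketch {
    // Population-style single-pass accumulation over (y, x) pairs.
    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4};
        double[] y = {3, 5, 7, 9};          // y = 2x + 1 exactly
        long n = x.length;                   // regr_count (all pairs non-null)
        double sx = 0, sy = 0, sxx = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            sx += x[i]; sy += y[i];
            sxx += x[i] * x[i]; sxy += x[i] * y[i];
        }
        double avgx = sx / n, avgy = sy / n;         // regr_avgx, regr_avgy
        double varPopX = sxx / n - avgx * avgx;      // regr_sxx / n
        double covarPop = sxy / n - avgx * avgy;     // regr_sxy / n
        double slope = covarPop / varPopX;           // regr_slope(y, x)
        double intercept = avgy - slope * avgx;      // regr_intercept(y, x)
        System.out.println(slope);      // 2.0
        System.out.println(intercept);  // 1.0
        System.out.println(n);          // 4
    }
}
```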
[jira] [Commented] (HIVE-16182) Semijoin: Avoid VectorHashKeyWrapper allocations for the bloom hash aggregate
[ https://issues.apache.org/jira/browse/HIVE-16182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906307#comment-15906307 ] Hive QA commented on HIVE-16182: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12857490/HIVE-16182.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 10339 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4090/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4090/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4090/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12857490 - PreCommit-HIVE-Build > Semijoin: Avoid VectorHashKeyWrapper allocations for the bloom hash aggregate > - > > Key: HIVE-16182 > URL: https://issues.apache.org/jira/browse/HIVE-16182 > Project: Hive > Issue Type: Improvement > Components: Vectorization >Affects Versions: 2.2.0 >Reporter: Gopal V >Assignee: Gopal V > Labels: performance > Attachments: HIVE-16182.1.patch > > > To avoid GC spam during the hash aggregate part of the bloom filter, the key > for the semijoin can be special-cased as an immutable empty key. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16182) Semijoin: Avoid VectorHashKeyWrapper allocations for the bloom hash aggregate
[ https://issues.apache.org/jira/browse/HIVE-16182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-16182: --- Status: Patch Available (was: Open) > Semijoin: Avoid VectorHashKeyWrapper allocations for the bloom hash aggregate > - > > Key: HIVE-16182 > URL: https://issues.apache.org/jira/browse/HIVE-16182 > Project: Hive > Issue Type: Improvement > Components: Vectorization >Affects Versions: 2.2.0 >Reporter: Gopal V >Assignee: Gopal V > Labels: performance > Attachments: HIVE-16182.1.patch > > > To avoid GC spam during the hash aggregate part of the bloom filter, the key > for the semijoin can be special-cased as an immutable empty key. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16182) Semijoin: Avoid VectorHashKeyWrapper allocations for the bloom hash aggregate
[ https://issues.apache.org/jira/browse/HIVE-16182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-16182: --- Labels: performance (was: ) > Semijoin: Avoid VectorHashKeyWrapper allocations for the bloom hash aggregate > - > > Key: HIVE-16182 > URL: https://issues.apache.org/jira/browse/HIVE-16182 > Project: Hive > Issue Type: Improvement > Components: Vectorization >Affects Versions: 2.2.0 >Reporter: Gopal V >Assignee: Gopal V > Labels: performance > Attachments: HIVE-16182.1.patch > > > To avoid GC spam during the hash aggregate part of the bloom filter, the key > for the semijoin can be special-cased as an immutable empty key. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16182) Semijoin: Avoid VectorHashKeyWrapper allocations for the bloom hash aggregate
[ https://issues.apache.org/jira/browse/HIVE-16182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V reassigned HIVE-16182: -- Assignee: Gopal V > Semijoin: Avoid VectorHashKeyWrapper allocations for the bloom hash aggregate > - > > Key: HIVE-16182 > URL: https://issues.apache.org/jira/browse/HIVE-16182 > Project: Hive > Issue Type: Improvement > Components: Vectorization >Affects Versions: 2.2.0 >Reporter: Gopal V >Assignee: Gopal V > Attachments: HIVE-16182.1.patch > > > To avoid GC spam during the hash aggregate part of the bloom filter, the key > for the semijoin can be special-cased as an immutable empty key. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16182) Semijoin: Avoid VectorHashKeyWrapper allocations for the bloom hash aggregate
[ https://issues.apache.org/jira/browse/HIVE-16182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-16182: --- Attachment: HIVE-16182.1.patch > Semijoin: Avoid VectorHashKeyWrapper allocations for the bloom hash aggregate > - > > Key: HIVE-16182 > URL: https://issues.apache.org/jira/browse/HIVE-16182 > Project: Hive > Issue Type: Improvement > Components: Vectorization >Affects Versions: 2.2.0 >Reporter: Gopal V > Attachments: HIVE-16182.1.patch > > > To avoid GC spam during the hash aggregate part of the bloom filter, the key > for the semijoin can be special-cased as an immutable empty key. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16181) Make logic for hdfs directory location extraction more generic, in webhcat test driver
[ https://issues.apache.org/jira/browse/HIVE-16181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aswathy Chellammal Sreekumar updated HIVE-16181: Attachment: HIVE-16181.1.patch > Make logic for hdfs directory location extraction more generic, in webhcat > test driver > -- > > Key: HIVE-16181 > URL: https://issues.apache.org/jira/browse/HIVE-16181 > Project: Hive > Issue Type: Test > Components: WebHCat >Reporter: Aswathy Chellammal Sreekumar >Priority: Minor > Attachments: HIVE-16181.1.patch > > > Patch to make regular expression for directory location lookup in > setLocationPermGroup of TestDriverCurl more generic to accommodate patterns > without port number like hdfs://mycluster//hive/warehouse/ -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16132) DataSize stats don't seem correct in semijoin opt branch
[ https://issues.apache.org/jira/browse/HIVE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906269#comment-15906269 ] Hive QA commented on HIVE-16132: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12857484/HIVE-16132.6.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 10339 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype] (batchId=153) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4089/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4089/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4089/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12857484 - PreCommit-HIVE-Build > DataSize stats don't seem correct in semijoin opt branch > > > Key: HIVE-16132 > URL: https://issues.apache.org/jira/browse/HIVE-16132 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-16132.1.patch, HIVE-16132.2.patch, > HIVE-16132.3.patch, HIVE-16132.4.patch, HIVE-16132.5.patch, HIVE-16132.6.patch > > > For the following operator tree snippet, the second Select is the start of a > semijoin optimization branch. Take a look at the Data size - it is the same > as the data size for its parent Select, even though the second select has > only a single bigint column in its projection (the parent has 2 columns). I > would expect the size to be 533328 (16 bytes * 33333). 
> Fixing this estimate may become important if we need to estimate the cost of > generating the min/max/bloomfilter. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
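The expectation in the report is simple arithmetic: 533328 bytes is 33333 rows at 16 bytes per row, i.e. the branch's estimate should shrink to the width of its single projected bigint rather than inherit the parent's two-column width. A sketch of that rows-times-row-width estimate (the 16-byte column width is taken from the report's figures as an assumption, not from Hive's JavaDataModel):

```java
public class DataSizeEstimate {
    // Per-branch size estimate: row count times the sum of the widths of the
    // columns that branch actually projects.
    static long estimate(long rows, int... colWidths) {
        long rowWidth = 0;
        for (int w : colWidths) rowWidth += w;
        return rows * rowWidth;
    }

    public static void main(String[] args) {
        // Parent Select projects two columns; the semijoin branch only one.
        System.out.println(estimate(33333, 16, 16)); // parent estimate
        System.out.println(estimate(33333, 16));     // expected branch estimate
    }
}
```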
[jira] [Updated] (HIVE-16132) DataSize stats don't seem correct in semijoin opt branch
[ https://issues.apache.org/jira/browse/HIVE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Jaiswal updated HIVE-16132: -- Attachment: HIVE-16132.6.patch Needed a code refresh locally. Result files updated. > DataSize stats don't seem correct in semijoin opt branch > > > Key: HIVE-16132 > URL: https://issues.apache.org/jira/browse/HIVE-16132 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-16132.1.patch, HIVE-16132.2.patch, > HIVE-16132.3.patch, HIVE-16132.4.patch, HIVE-16132.5.patch, HIVE-16132.6.patch > > > For the following operator tree snippet, the second Select is the start of a > semijoin optimization branch. Take a look at the Data size - it is the same > as the data size for its parent Select, even though the second select has > only a single bigint column in its projection (the parent has 2 columns). I > would expect the size to be 533328 (16 bytes * 33333). > Fixing this estimate may become important if we need to estimate the cost of > generating the min/max/bloomfilter. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HIVE-8750) Commit initial encryption work
[ https://issues.apache.org/jira/browse/HIVE-8750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906150#comment-15906150 ] Lefty Leverenz edited comment on HIVE-8750 at 3/11/17 10:24 AM: The encryption branch was merged to trunk for release 1.1.0 (formerly known as 0.15). See HIVE-9264. So *hive.exec.stagingdir* and *hive.exec.copyfile.maxsize* need to be documented in the wiki. * [Configuration Properties -- Query and DDL Execution | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution] Adding a TODOC15 label (for release 1.1.0). Edit (11/Mar/17): HIVE-14864 corrects the description of *hive.exec.copyfile.maxsize* in release 2.2.0 -- its value is in bytes, not megabytes. was (Author: le...@hortonworks.com): The encryption branch was merged to trunk for release 1.1.0 (formerly known as 0.15). See HIVE-9264. So *hive.exec.stagingdir* and *hive.exec.copyfile.maxsize* need to be documented in the wiki. * [Configuration Properties -- Query and DDL Execution | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution] Adding a TODOC15 label (for release 1.1.0). > Commit initial encryption work > -- > > Key: HIVE-8750 > URL: https://issues.apache.org/jira/browse/HIVE-8750 > Project: Hive > Issue Type: Sub-task >Reporter: Brock Noland >Assignee: Sergio Peña > Labels: TODOC15 > Fix For: encryption-branch, 1.1.0 > > Attachments: HIVE-8750.1.patch > > > I believe Sergio has some work done for encryption. In this item we'll commit > it to branch. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-14864) Distcp is not called from MoveTask when src is a directory
[ https://issues.apache.org/jira/browse/HIVE-14864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906158#comment-15906158 ] Lefty Leverenz commented on HIVE-14864: --- Doc note: This adds *hive.exec.copyfile.maxnumfiles* to HiveConf.java and corrects the description of *hive.exec.copyfile.maxsize* (added in 1.1.0 by HIVE-8750 but not documented yet) so they need to be documented in the wiki. * [Configuration Properties -- Query and DDL Execution | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution] Added a TODOC2.2 label. > Distcp is not called from MoveTask when src is a directory > -- > > Key: HIVE-14864 > URL: https://issues.apache.org/jira/browse/HIVE-14864 > Project: Hive > Issue Type: Bug >Reporter: Vihang Karajgaonkar >Assignee: Sahil Takiar > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-14864.1.patch, HIVE-14864.2.patch, > HIVE-14864.3.patch, HIVE-14864.4.patch, HIVE-14864.patch > > > In FileUtils.java the following code does not get executed even when src > directory size is greater than HIVE_EXEC_COPYFILE_MAXSIZE because > srcFS.getFileStatus(src).getLen() returns 0 when src is a directory. We > should use srcFS.getContentSummary(src).getLength() instead. > {noformat} > /* Run distcp if source file/dir is too big */ > if (srcFS.getUri().getScheme().equals("hdfs") && > srcFS.getFileStatus(src).getLen() > > conf.getLongVar(HiveConf.ConfVars.HIVE_EXEC_COPYFILE_MAXSIZE)) { > LOG.info("Source is " + srcFS.getFileStatus(src).getLen() + " bytes. > (MAX: " + conf.getLongVar(HiveConf.ConfVars.HIVE_EXEC_COPYFILE_MAXSIZE) + > ")"); > LOG.info("Launch distributed copy (distcp) job."); > HiveConfUtil.updateJobCredentialProviders(conf); > copied = shims.runDistCp(src, dst, conf); > if (copied && deleteSource) { > srcFS.delete(src, true); > } > } > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
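The fix described is a one-call substitution: on HDFS, `getFileStatus(src).getLen()` is 0 for a directory, while `getContentSummary(src).getLength()` sums the bytes underneath, so only the latter can trip the distcp size threshold for a directory source. The same distinction can be sketched against a local filesystem with plain `java.nio` (an illustrative analogue, not the Hadoop API):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class DirectorySize {
    // Analogue of getContentSummary().getLength(): a directory's own length
    // is not its contents' size, so sum the regular files beneath it.
    static long contentLength(Path root) throws IOException {
        try (Stream<Path> files = Files.walk(root)) {
            return files.filter(Files::isRegularFile)
                        .mapToLong(p -> p.toFile().length())
                        .sum();
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("distcp-demo");
        Files.write(dir.resolve("a.bin"), new byte[1024]);
        Files.write(dir.resolve("b.bin"), new byte[2048]);
        System.out.println(contentLength(dir)); // 1024 + 2048
    }
}
```

With the recursive sum in place, a directory totalling more than `hive.exec.copyfile.maxsize` bytes would correctly trigger the distcp path.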
[jira] [Updated] (HIVE-14864) Distcp is not called from MoveTask when src is a directory
[ https://issues.apache.org/jira/browse/HIVE-14864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-14864: -- Labels: TODOC2.2 (was: ) > Distcp is not called from MoveTask when src is a directory > -- > > Key: HIVE-14864 > URL: https://issues.apache.org/jira/browse/HIVE-14864 > Project: Hive > Issue Type: Bug >Reporter: Vihang Karajgaonkar >Assignee: Sahil Takiar > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-14864.1.patch, HIVE-14864.2.patch, > HIVE-14864.3.patch, HIVE-14864.4.patch, HIVE-14864.patch > > > In FileUtils.java the following code does not get executed even when src > directory size is greater than HIVE_EXEC_COPYFILE_MAXSIZE because > srcFS.getFileStatus(src).getLen() returns 0 when src is a directory. We > should use srcFS.getContentSummary(src).getLength() instead. > {noformat} > /* Run distcp if source file/dir is too big */ > if (srcFS.getUri().getScheme().equals("hdfs") && > srcFS.getFileStatus(src).getLen() > > conf.getLongVar(HiveConf.ConfVars.HIVE_EXEC_COPYFILE_MAXSIZE)) { > LOG.info("Source is " + srcFS.getFileStatus(src).getLen() + " bytes. > (MAX: " + conf.getLongVar(HiveConf.ConfVars.HIVE_EXEC_COPYFILE_MAXSIZE) + > ")"); > LOG.info("Launch distributed copy (distcp) job."); > HiveConfUtil.updateJobCredentialProviders(conf); > copied = shims.runDistCp(src, dst, conf); > if (copied && deleteSource) { > srcFS.delete(src, true); > } > } > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-8750) Commit initial encryption work
[ https://issues.apache.org/jira/browse/HIVE-8750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906150#comment-15906150 ] Lefty Leverenz commented on HIVE-8750: -- The encryption branch was merged to trunk for release 1.1.0 (formerly known as 0.15). See HIVE-9264. So *hive.exec.stagingdir* and *hive.exec.copyfile.maxsize* need to be documented in the wiki. * [Configuration Properties -- Query and DDL Execution | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution] Adding a TODOC15 label (for release 1.1.0). > Commit initial encryption work > -- > > Key: HIVE-8750 > URL: https://issues.apache.org/jira/browse/HIVE-8750 > Project: Hive > Issue Type: Sub-task >Reporter: Brock Noland >Assignee: Sergio Peña > Labels: TODOC15 > Fix For: encryption-branch, 1.1.0 > > Attachments: HIVE-8750.1.patch > > > I believe Sergio has some work done for encryption. In this item we'll commit > it to branch. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-8750) Commit initial encryption work
[ https://issues.apache.org/jira/browse/HIVE-8750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-8750: - Labels: TODOC15 (was: ) > Commit initial encryption work > -- > > Key: HIVE-8750 > URL: https://issues.apache.org/jira/browse/HIVE-8750 > Project: Hive > Issue Type: Sub-task >Reporter: Brock Noland >Assignee: Sergio Peña > Labels: TODOC15 > Fix For: encryption-branch, 1.1.0 > > Attachments: HIVE-8750.1.patch > > > I believe Sergio has some work done for encryption. In this item we'll commit > it to branch. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-8750) Commit initial encryption work
[ https://issues.apache.org/jira/browse/HIVE-8750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-8750: - Fix Version/s: 1.1.0 > Commit initial encryption work > -- > > Key: HIVE-8750 > URL: https://issues.apache.org/jira/browse/HIVE-8750 > Project: Hive > Issue Type: Sub-task >Reporter: Brock Noland >Assignee: Sergio Peña > Fix For: encryption-branch, 1.1.0 > > Attachments: HIVE-8750.1.patch > > > I believe Sergio has some work done for encryption. In this item we'll commit > it to branch. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16180) LLAP: Native memory leak in EncodedReader
[ https://issues.apache.org/jira/browse/HIVE-16180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906144#comment-15906144 ] Hive QA commented on HIVE-16180: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12857456/HIVE-16180.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 10339 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=141) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4088/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4088/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4088/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12857456 - PreCommit-HIVE-Build > LLAP: Native memory leak in EncodedReader > - > > Key: HIVE-16180 > URL: https://issues.apache.org/jira/browse/HIVE-16180 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: DirectCleaner.java, FullGC-15GB-cleanup.png, > Full-gc-native-mem-cleanup.png, HIVE-16180.1.patch, HIVE-16180.2.patch, > Native-mem-spike.png > > > Observed this in internal test run. There is a native memory leak in Orc > EncodedReaderImpl that can cause YARN pmem monitor to kill the container > running the daemon. Direct byte buffers are null'ed out which is not > guaranteed to be cleaned until next Full GC. 
To show this issue, attaching a > small test program that allocates 3x256MB direct byte buffers. First buffer > is null'ed out but still native memory is used. Second buffer uses a Cleaner to > clean up its native allocation. Third buffer is also null'ed, but this time > a System.gc() is invoked, which cleans up all native memory. Output from the > test program is below > {code} > Allocating 3x256MB direct memory.. > Native memory used: 786432000 > Native memory used after data1=null: 786432000 > Native memory used after data2.clean(): 524288000 > Native memory used after data3=null: 524288000 > Native memory used without gc: 524288000 > Native memory used after gc: 0 > {code} > Longer term improvements/solutions: > 1) Use DirectBufferPool from hadoop or netty's > https://netty.io/4.0/api/io/netty/buffer/PooledByteBufAllocator.html as > direct byte buffer allocations are expensive (System.gc() + 100ms thread > sleep). > 2) Use HADOOP-12760 for proper cleaner invocation in JDK8 and JDK9 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
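The leak mechanics above hinge on direct ByteBuffers holding memory outside the Java heap that only a GC-run Cleaner releases; dropping the Java reference frees nothing by itself. A minimal sketch of that lifecycle (it only demonstrates the allocation and the GC hint; measuring native usage as the attached test program does requires JMX/Unsafe access not shown here):

```java
import java.nio.ByteBuffer;

public class DirectBufferSketch {
    public static void main(String[] args) {
        // Native (off-heap) allocation; the Java object is just a small handle.
        ByteBuffer data = ByteBuffer.allocateDirect(1 << 20);
        System.out.println(data.capacity());
        data = null;   // the 1MB of native memory is NOT released here...
        System.gc();   // ...only a GC (eventually) runs the buffer's Cleaner
        System.out.println("native memory reclaimed only via gc/cleaner");
    }
}
```

This is why a daemon that null's out many large direct buffers between Full GCs can exceed its pmem limit even though its Java heap looks healthy.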
[jira] [Commented] (HIVE-15981) Allow empty grouping sets
[ https://issues.apache.org/jira/browse/HIVE-15981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906141#comment-15906141 ] Lefty Leverenz commented on HIVE-15981: --- Should this behavioral change be documented in the wiki? * [Group By | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+GroupBy] If so, please add a TODOC2.2 label. > Allow empty grouping sets > - > > Key: HIVE-15981 > URL: https://issues.apache.org/jira/browse/HIVE-15981 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Carter Shanklin >Assignee: Zoltan Haindrich > Fix For: 2.2.0 > > Attachments: HIVE-15981.1.patch, HIVE-15981.2.patch > > > group by () should be treated as equivalent to no group by clause. Currently > it throws a parse error -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16132) DataSize stats don't seem correct in semijoin opt branch
[ https://issues.apache.org/jira/browse/HIVE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906131#comment-15906131 ] Hive QA commented on HIVE-16132: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12857449/HIVE-16132.5.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 10339 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction] (batchId=148) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4087/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4087/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4087/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12857449 - PreCommit-HIVE-Build > DataSize stats don't seem correct in semijoin opt branch > > > Key: HIVE-16132 > URL: https://issues.apache.org/jira/browse/HIVE-16132 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-16132.1.patch, HIVE-16132.2.patch, > HIVE-16132.3.patch, HIVE-16132.4.patch, HIVE-16132.5.patch > > > For the following operator tree snippet, the second Select is the start of a > semijoin optimization branch. Take a look at the Data size - it is the same > as the data size for its parent Select, even though the second select has > only a single bigint column in its projection (the parent has 2 columns). I > would expect the size to be 533328 (16 bytes * 33333). 
> Fixing this estimate may become important if we need to estimate the cost of > generating the min/max/bloomfilter. -- This message was sent by Atlassian JIRA (v6.3.15#6346)