[jira] [Commented] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15450573#comment-15450573 ]

Abdullah Yousufi commented on HIVE-14373:
-----------------------------------------

Thanks for the comments [~spena] and [~poeppt]. Currently, the user passes in the path of the directory where the tests should be run, so the user chooses the location, similarly to passing in a unique test ID. But I agree, I think it's a good idea for the directory to be explicitly created at the beginning and removed at the end. Unfortunately, I currently do not have access to a developer environment. Would anyone be interested in finishing up this ticket?

> Add integration tests for hive on S3
> ------------------------------------
>
>         Key: HIVE-14373
>         URL: https://issues.apache.org/jira/browse/HIVE-14373
>     Project: Hive
>  Issue Type: Sub-task
>    Reporter: Sergio Peña
>    Assignee: Abdullah Yousufi
> Attachments: HIVE-14373.02.patch, HIVE-14373.03.patch, HIVE-14373.04.patch, HIVE-14373.patch
>
> With Hive doing improvements to run on S3, it would be ideal to have better integration testing on S3.
> These S3 tests won't be able to be executed by HiveQA because they will need Amazon credentials. We need to write a suite based on ideas from the Hadoop project where:
> - an xml file is provided with S3 credentials
> - a committer must run these tests manually to verify they work
> - the xml file should not be part of the commit, and HiveQA should not run these tests.
> https://wiki.apache.org/hadoop/HowToContribute#Submitting_patches_against_object_stores_such_as_Amazon_S3.2C_OpenStack_Swift_and_Microsoft_Azure

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
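[Editor's note] The credentials file described in the issue could, for illustration, follow the Hadoop convention from the linked wiki page (an `auth-keys.xml` kept out of version control). This is a sketch, not the file any attached patch actually uses: `fs.s3a.access.key` and `fs.s3a.secret.key` are the standard S3A credential properties, while the file name, the `test.s3.bucket.path` property, and the bucket URL are assumptions.

```xml
<!-- auth-keys.xml: kept out of version control; supplies the credentials the
     manually-run S3 tests need. File name and the test.s3.bucket.path
     property are assumptions modeled on Hadoop's convention. -->
<configuration>
  <property>
    <name>fs.s3a.access.key</name>
    <value>YOUR_AWS_ACCESS_KEY_ID</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>YOUR_AWS_SECRET_ACCESS_KEY</value>
  </property>
  <!-- Hypothetical property for the bucket path the tests run against. -->
  <property>
    <name>test.s3.bucket.path</name>
    <value>s3a://your-test-bucket/hive-it</value>
  </property>
</configuration>
```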
[jira] [Commented] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15447371#comment-15447371 ]

Abdullah Yousufi commented on HIVE-14373:
-----------------------------------------

How would this mkdir and rmdir on S3 happen? Is there a way to automate this as part of the testing rather than having the user do it manually?
[jira] [Updated] (HIVE-14272) ConditionalResolverMergeFiles should keep staging data on HDFS, then copy (no rename) to S3
[ https://issues.apache.org/jira/browse/HIVE-14272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abdullah Yousufi updated HIVE-14272:
------------------------------------
    Assignee: Sergio Peña  (was: Abdullah Yousufi)

> ConditionalResolverMergeFiles should keep staging data on HDFS, then copy (no rename) to S3
> -------------------------------------------------------------------------------------------
>
>         Key: HIVE-14272
>         URL: https://issues.apache.org/jira/browse/HIVE-14272
>     Project: Hive
>  Issue Type: Sub-task
>    Reporter: Sergio Peña
>    Assignee: Sergio Peña
>
> If {{hive.merge.mapfiles}} is true, and the output table to write is on S3, then Hive will generate a conditional plan where smaller files will be merged into larger ones.
> If the output files written by the initial MR job are small, then a second MR job is run to merge the output into larger files (a copy from S3 to S3 in the current code).
> If the original output files are large enough, then the conditional task is followed by a move/rename, which is very expensive on S3.
> We should keep staging data on HDFS prior to copying it to S3 as final files.
[jira] [Updated] (HIVE-14271) FileSinkOperator should not rename files to final paths when S3 is the default destination
[ https://issues.apache.org/jira/browse/HIVE-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abdullah Yousufi updated HIVE-14271:
------------------------------------
    Assignee: Sergio Peña  (was: Abdullah Yousufi)

> FileSinkOperator should not rename files to final paths when S3 is the default destination
> ------------------------------------------------------------------------------------------
>
>         Key: HIVE-14271
>         URL: https://issues.apache.org/jira/browse/HIVE-14271
>     Project: Hive
>  Issue Type: Sub-task
>    Reporter: Sergio Peña
>    Assignee: Sergio Peña
>
> FileSinkOperator does a rename of {{outPaths -> finalPaths}} when it finishes writing all rows to a temporary path. The problem is that S3 does not support renaming.
> Two options can be considered:
> a. Use a copy operation instead. After FileSinkOperator writes all rows to outPaths, the commit method will do a copy() call instead of move().
> b. Write row by row directly to the S3 path (see HIVE-1620). This may perform better, but we should take care of the cleanup part in case of writing errors.
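[Editor's note] Option (a) above can be sketched in plain Java. This is a hypothetical illustration of a copy-then-delete commit using `java.nio.file`, not Hive's actual FileSinkOperator or Hadoop's FileSystem API; the class and method names are invented.

```java
import java.io.IOException;
import java.nio.file.*;

// Sketch of option (a): commit by copying each written file from the
// temporary output directory to the final destination, then deleting the
// temp copy -- avoiding the rename that S3 cannot do natively.
public class CopyCommitSketch {
    // Copies every regular file under outPath into finalPath, then removes
    // the staged original.
    public static void commit(Path outPath, Path finalPath) throws IOException {
        Files.createDirectories(finalPath);
        try (DirectoryStream<Path> files = Files.newDirectoryStream(outPath)) {
            for (Path src : files) {
                if (Files.isRegularFile(src)) {
                    Path dst = finalPath.resolve(src.getFileName().toString());
                    Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
                    Files.delete(src); // clean up the staging copy
                }
            }
        }
    }
}
```

The cleanup concern in option (b) applies here too: if the copy fails partway, the staged files remain and must be removed by a separate error path.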
[jira] [Updated] (HIVE-14165) Remove Hive file listing during split computation
[ https://issues.apache.org/jira/browse/HIVE-14165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abdullah Yousufi updated HIVE-14165:
------------------------------------
    Attachment: HIVE-14165.03.patch

> Remove Hive file listing during split computation
> -------------------------------------------------
>
>         Key: HIVE-14165
>         URL: https://issues.apache.org/jira/browse/HIVE-14165
>     Project: Hive
>  Issue Type: Sub-task
> Affects Versions: 2.1.0
>    Reporter: Abdullah Yousufi
>    Assignee: Abdullah Yousufi
> Attachments: HIVE-14165.02.patch, HIVE-14165.03.patch, HIVE-14165.patch
>
> The Hive side listing in FetchOperator.java is unnecessary, since Hadoop's FileInputFormat.java will list the files during split computation anyway to determine their size. One way to remove this is to catch the InvalidInputException thrown by FileInputFormat#getSplits() on the Hive side instead of doing the file listing beforehand.
> For S3 select queries on partitioned tables, this results in a 2x speedup.
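[Editor's note] The approach in the issue description can be illustrated with a small stand-alone sketch: instead of listing files up front to detect an empty input path, call getSplits() directly and treat its "no input paths" error as the empty case. This is plain Java with invented stand-in types, not Hive's FetchOperator or Hadoop's real InputFormat interface.

```java
import java.util.Collections;
import java.util.List;

// Sketch of catching the "no input" error from split computation instead of
// performing a redundant listing beforehand.
public class SplitFetchSketch {
    // Stand-in for org.apache.hadoop.mapred.InvalidInputException.
    static class InvalidInputException extends RuntimeException {}

    // Stand-in for the relevant slice of Hadoop's InputFormat.
    interface InputFormat {
        List<String> getSplits(String path); // throws InvalidInputException when empty
    }

    public static List<String> fetchSplits(InputFormat fmt, String path) {
        try {
            return fmt.getSplits(path); // Hadoop lists the files here anyway
        } catch (InvalidInputException e) {
            // Empty input: previously detected by an extra Hive-side listing,
            // now derived from the exception, saving one S3 listing per path.
            return Collections.emptyList();
        }
    }
}
```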
[jira] [Updated] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abdullah Yousufi updated HIVE-14373:
------------------------------------
    Attachment: HIVE-14373.04.patch
[jira] [Commented] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15428807#comment-15428807 ]

Abdullah Yousufi commented on HIVE-14373:
-----------------------------------------

Actually, appending the timestamp to the beginning would be an issue, since table names would then need to be masked in the output files. I think we need to rethink how we want to handle parallel execution and source/output tables. I may revert these changes for now.
[jira] [Commented] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15428752#comment-15428752 ]

Abdullah Yousufi commented on HIVE-14373:
-----------------------------------------

[~yalovyyi] So one issue I'm noticing is that after the tests finish, the source and output directories still exist on the S3 bucket. Any thoughts on how to resolve this? One option I'm considering is to create the test tables in the bucket's root directory, so that there are no leftover directories. In this case, the timestamp could be appended to the beginning of the table name to prevent issues with parallel test execution.
[jira] [Comment Edited] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15428752#comment-15428752 ]

Abdullah Yousufi edited comment on HIVE-14373 at 8/19/16 8:12 PM:
------------------------------------------------------------------

[~yalovyyi] So one issue I'm noticing is that after the tests are finished, the source and output directories still exist on the S3 bucket. Any thoughts on how to resolve this? One option I'm considering is to create the test tables on the bucket's root directory, so that there are no leftover directories afterwards. In this case, the timestamp could be appended to the beginning of the table name to prevent issues with parallel test execution.

was (Author: ayousufi):
[~yalovyyi] So one issue right now I'm noticing due is that after the tests are finished, the source and output directories still exist on the S3 bucket. Any thoughts on how to resolve this? One option I'm considering is to create the test ables on the bucket's root directory, so that there are no left over directories. In this case, the timestamp could be appended to the beginning of the table name to prevent issues with parallel test execution.
[jira] [Updated] (HIVE-14560) Support exchange partition between s3 and hdfs tables
[ https://issues.apache.org/jira/browse/HIVE-14560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abdullah Yousufi updated HIVE-14560:
------------------------------------
    Attachment: (was: HIVE-14560.02.patch)

> Support exchange partition between s3 and hdfs tables
> -----------------------------------------------------
>
>         Key: HIVE-14560
>         URL: https://issues.apache.org/jira/browse/HIVE-14560
>     Project: Hive
>  Issue Type: Bug
>    Reporter: Abdullah Yousufi
>    Assignee: Abdullah Yousufi
>     Fix For: 2.2.0
> Attachments: HIVE-14560.patch
>
> {code}
> alter table s3_tbl exchange partition (country='USA', state='CA') with table hdfs_tbl;
> {code}
> results in:
> {code}
> Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: java.lang.IllegalArgumentException Wrong FS: s3a://hive-on-s3/s3_tbl/country=USA/state=CA, expected: hdfs://localhost:9000) (state=08S01,code=1)
> {code}
> because the check for whether the s3 destination table path exists occurs on the hdfs filesystem.
> Furthermore, exchanging from s3 to hdfs fails because the hdfs rename operation is not supported across filesystems. The fix uses copy + deletion in the case that the filesystems differ.
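[Editor's note] The "copy + deletion when the filesystems differ" fix described in the issue can be sketched as follows. This is a hypothetical illustration using `java.nio.file` in place of Hadoop's FileSystem API; the class, method, and the `sameFs` flag are invented for the sketch.

```java
import java.io.IOException;
import java.nio.file.*;

// Sketch of the cross-filesystem exchange fix: rename only when source and
// destination share a filesystem; otherwise copy, then delete the original.
public class ExchangePartitionSketch {
    public static void moveAcrossFs(Path src, Path dst, boolean sameFs) throws IOException {
        if (sameFs) {
            Files.move(src, dst);  // cheap rename within one filesystem
        } else {
            Files.copy(src, dst);  // cross-filesystem: copy the data...
            Files.delete(src);     // ...then delete the original
        }
    }
}
```

In the real patch the "same filesystem" decision would presumably compare the resolved FileSystem of each path (e.g. `s3a://` vs `hdfs://`), which also addresses the "Wrong FS" error above, where the existence check ran against the wrong filesystem.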
[jira] [Updated] (HIVE-14165) Remove Hive file listing during split computation
[ https://issues.apache.org/jira/browse/HIVE-14165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abdullah Yousufi updated HIVE-14165:
------------------------------------
    Attachment: (was: HIVE-14165.02.patch)
[jira] [Updated] (HIVE-14560) Support exchange partition between s3 and hdfs tables
[ https://issues.apache.org/jira/browse/HIVE-14560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abdullah Yousufi updated HIVE-14560:
------------------------------------
    Attachment: HIVE-14560.02.patch
[jira] [Updated] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abdullah Yousufi updated HIVE-14373:
------------------------------------
    Attachment: (was: HIVE-14373.03.patch)
[jira] [Updated] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abdullah Yousufi updated HIVE-14373:
------------------------------------
    Attachment: HIVE-14373.03.patch
[jira] [Updated] (HIVE-14165) Remove Hive file listing during split computation
[ https://issues.apache.org/jira/browse/HIVE-14165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abdullah Yousufi updated HIVE-14165:
------------------------------------
    Attachment: HIVE-14165.02.patch
[jira] [Updated] (HIVE-14560) Support exchange partition between s3 and hdfs tables
[ https://issues.apache.org/jira/browse/HIVE-14560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abdullah Yousufi updated HIVE-14560:
------------------------------------
    Attachment: (was: HIVE-14560.02.patch)
[jira] [Updated] (HIVE-14560) Support exchange partition between s3 and hdfs tables
[ https://issues.apache.org/jira/browse/HIVE-14560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abdullah Yousufi updated HIVE-14560:
------------------------------------
    Attachment: HIVE-14560.02.patch
[jira] [Updated] (HIVE-14560) Support exchange partition between s3 and hdfs tables
[ https://issues.apache.org/jira/browse/HIVE-14560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abdullah Yousufi updated HIVE-14560:
------------------------------------
    Attachment: HIVE-14560.02.patch

Removed whitespace lines modified by editor + changed FileUtils.copy() to copy()
[jira] [Updated] (HIVE-14165) Remove Hive file listing during split computation
[ https://issues.apache.org/jira/browse/HIVE-14165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abdullah Yousufi updated HIVE-14165:
------------------------------------
    Attachment: HIVE-14165.02.patch
[jira] [Comment Edited] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15427324#comment-15427324 ]

Abdullah Yousufi edited comment on HIVE-14373 at 8/18/16 10:54 PM:
-------------------------------------------------------------------

Hey everyone, I've updated RB with this patch, which takes into account HIVE-1's removal of vm files. I've also added features like source and output table paths, as well as making the bucket path a property to pass in when running the test. Major thanks to [~yalovyyi] for the reference patch and everyone else for their feedback so far.

was (Author: ayousufi):
Hey everyone, I've updated RB with this patch, which takes into account HIVE-1's removal of vm files. I've also added features like source and output table paths, as well as making the bucket path a property to pass in when running the test. Major thanks to [~yalovyyi]'s for the reference patch and everyone else for their feedback so far.
[jira] [Updated] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abdullah Yousufi updated HIVE-14373:
------------------------------------
    Attachment: HIVE-14373.03.patch

Hey everyone, I've updated RB with this patch, which takes into account HIVE-1's removal of vm files. I've also added features like source and output table paths, as well as making the bucket path a property to pass in when running the test. Major thanks to [~yalovyyi] for the reference patch and everyone else for their feedback so far.
[jira] [Commented] (HIVE-14165) Remove Hive file listing during split computation
[ https://issues.apache.org/jira/browse/HIVE-14165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15426846#comment-15426846 ]

Abdullah Yousufi commented on HIVE-14165:
-----------------------------------------

I believe when Hive calls getSplits() it's actually using {code}org.apache.hadoop.mapred.FileInputFormat{code}.

And is the updated listStatus faster in the non-recursive case as well? If not, I don't think it makes sense to pass in the recursive flag as true, since Hive is only interested in the files at the top level of the path, given that it currently calls getSplits() for each partition. However, if Hive were changed to call getSplits() on the root directory in the partitioned case, then listStatus(recursive) would make sense. I decided against that change because I was not sure how to best handle partition elimination. For example, if the query selects a single partition from a table, then doing listStatus(recursive) on the root directory would be slower than a listStatus on the single partition.

Also, Qubole mentions the following, which may be something to pursue in the future:

{code}
"we modified split computation to invoke listing at the level of the parent directory. This call returns all files (and their sizes) in all subdirectories in blocks of 1000. Some subdirectories and files may not be of interest to job/query e.g. partition elimination may be eliminated some of them. We take advantage of the fact that file listing is in lexicographic order and perform a modified merge join of the list of files and list of directories of interest."
{code}

When you mentioned earlier that Hadoop grabs 5000 objects at a time, does that include files in subdirectories?
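[Editor's note] The Qubole idea quoted in the comment above can be sketched as a single merge-style pass: given a lexicographically sorted listing of all files under the table root and a sorted list of partition directories that survived elimination, keep only the files that fall under one of those directories. This is an invented illustration of the described technique, not code from any attached patch.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a merge join between a sorted file listing and the sorted set of
// partition directories of interest; both lists are walked once.
public class ListingMergeJoinSketch {
    public static List<String> filter(List<String> sortedFiles, List<String> sortedDirs) {
        List<String> kept = new ArrayList<>();
        int d = 0;
        for (String file : sortedFiles) {
            // Advance past directories that sort entirely before this file.
            while (d < sortedDirs.size() && !file.startsWith(sortedDirs.get(d))
                   && file.compareTo(sortedDirs.get(d)) > 0) {
                d++;
            }
            if (d < sortedDirs.size() && file.startsWith(sortedDirs.get(d))) {
                kept.add(file); // file lies inside a surviving partition
            }
        }
        return kept;
    }
}
```

Files in eliminated partitions (e.g. {{p=2}} below) are skipped without any per-partition listing call.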
[jira] [Updated] (HIVE-14165) Remove Hive file listing during split computation
[ https://issues.apache.org/jira/browse/HIVE-14165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-14165: Attachment: HIVE-14165.patch
[jira] [Updated] (HIVE-14165) Remove Hive file listing during split computation
[ https://issues.apache.org/jira/browse/HIVE-14165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-14165: Description: The Hive side listing in FetchOperator.java is unnecessary, since Hadoop's FileInputFormat.java will list the files during split computation anyway to determine their size. One way to remove this is to catch the InvalidInputFormat exception thrown by FileInputFormat#getSplits() on the Hive side instead of doing the file listing beforehand. For S3 select queries on partitioned tables, this results in a 2x speedup. was: The Hive side listing in FetchOperator.java is unnecessary, since Hadoop's FileInputFormat.java will list the files during split computation anyway to determine their size. One way to remove this is to catch the InvalidInputFormat exception FileInputFormat#getSplits() on the Hive side instead of doing the file listing beforehand. For S3 select queries on partitioned tables, this results in a 2x speedup.
[jira] [Updated] (HIVE-14165) Remove Hive file listing during split computation
[ https://issues.apache.org/jira/browse/HIVE-14165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-14165: Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-14165) Remove Hive file listing during split computation
[ https://issues.apache.org/jira/browse/HIVE-14165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-14165: Description: The Hive side listing in FetchOperator.java is unnecessary, since Hadoop's FileInputFormat.java will list the files during split computation anyway to determine their size. One way to remove this is to catch the InvalidInputFormat exception FileInputFormat#getSplits() on the Hive side instead of doing the file listing beforehand. For S3 select queries on partitioned tables, this results in a 2x speedup. was: Split size computation may be improved by the optimizations for listFiles() in HADOOP-13208
[jira] [Updated] (HIVE-14165) Remove Hive file listing during split computation
[ https://issues.apache.org/jira/browse/HIVE-14165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-14165: Summary: Remove Hive file listing during split computation (was: Enable faster S3 Split Computation)
[jira] [Commented] (HIVE-14165) Enable faster S3 Split Computation
[ https://issues.apache.org/jira/browse/HIVE-14165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15425825#comment-15425825 ] Abdullah Yousufi commented on HIVE-14165: - Actually, on closer look, FileInputFormat's listStatus specifically throws an InvalidInputException in those two cases, instead of a generic IOException, so I can catch that.
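The control flow proposed above — skip the Hive-side pre-listing and instead catch the specific exception getSplits() throws for a missing or empty path — can be sketched with local stand-ins. The InvalidInputException here is a stub mirroring org.apache.hadoop.mapred.InvalidInputException (which extends IOException); getSplits below is a hypothetical simplification, not the real Hadoop signature.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;

public class SplitFetchSketch {
    // Local stub mirroring org.apache.hadoop.mapred.InvalidInputException.
    static class InvalidInputException extends IOException {
        InvalidInputException(String msg) { super(msg); }
    }

    // Stand-in for split computation: throws the specific exception when
    // the path yields no input, instead of a generic IOException.
    static List<String> getSplits(List<String> files) throws IOException {
        if (files.isEmpty()) {
            throw new InvalidInputException("input path does not exist or matched no files");
        }
        return files; // pretend each file becomes one split
    }

    // Hive side: no pre-listing. "No input" is treated as zero splits and
    // execution continues; any other I/O failure still propagates.
    static List<String> computeSplits(List<String> files) {
        try {
            return getSplits(files);
        } catch (InvalidInputException e) {
            return List.of(); // empty partition: continue with no splits
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Catching the narrow subclass is what makes the pre-listing removable: genuine I/O errors are still surfaced, only the empty-input case is swallowed.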
[jira] [Updated] (HIVE-14560) Support exchange partition between s3 and hdfs tables
[ https://issues.apache.org/jira/browse/HIVE-14560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-14560: Status: Patch Available (was: Open) > Support exchange partition between s3 and hdfs tables > - > > Key: HIVE-14560 > URL: https://issues.apache.org/jira/browse/HIVE-14560 > Project: Hive > Issue Type: Bug >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi > Fix For: 2.2.0 > > Attachments: HIVE-14560.patch > > > {code} > alter table s3_tbl exchange partition (country='USA', state='CA') with table > hdfs_tbl; > {code} > results in: > {code} > Error: Error while processing statement: FAILED: Execution Error, return code > 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got > exception: java.lang.IllegalArgumentException Wrong FS: > s3a://hive-on-s3/s3_tbl/country=USA/state=CA, expected: > hdfs://localhost:9000) (state=08S01,code=1) > {code} > because the check for whether the s3 destination table path exists occurs on > the hdfs filesystem. > Furthermore, exchanging from s3 to hdfs fails because the hdfs rename > operation is not supported across filesystems. The fix uses copy + deletion in > the case that the file systems differ.
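The fix described in HIVE-14560 hinges on deciding whether the two table locations live on the same filesystem: rename within one filesystem, copy + delete across filesystems. A minimal scheme/authority comparison sketch follows; the real code compares Hadoop FileSystem instances and would use a helper such as FileUtil.copy for the copy path, so the names below are illustrative only.

```java
import java.net.URI;

public class MoveStrategy {
    enum Op { RENAME, COPY_THEN_DELETE }

    // Decide how to move data between two table locations. A plain rename
    // only works within one filesystem; across filesystems (e.g. hdfs://
    // to s3a://) the data must be copied and the source deleted.
    static Op chooseMove(String src, String dst) {
        URI s = URI.create(src);
        URI d = URI.create(dst);
        boolean sameFs =
                String.valueOf(s.getScheme()).equals(String.valueOf(d.getScheme()))
                && String.valueOf(s.getAuthority()).equals(String.valueOf(d.getAuthority()));
        return sameFs ? Op.RENAME : Op.COPY_THEN_DELETE;
    }
}
```

This is also why the "Wrong FS" error above appears: a path from one filesystem was handed to an operation bound to the other, before any move strategy was even chosen.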
[jira] [Updated] (HIVE-14560) Support exchange partition between s3 and hdfs tables
[ https://issues.apache.org/jira/browse/HIVE-14560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-14560: Attachment: HIVE-14560.patch
[jira] [Commented] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15423398#comment-15423398 ] Abdullah Yousufi commented on HIVE-14373: - Hey [~kgyrtkirk], thanks for letting us know about this. I took a look at your change and am working to adapt this patch to your changes. The next patch I upload on this patch's RB (https://reviews.apache.org/r/50938/) will not have vm files in it, so I'll let you know if I have any questions and would appreciate your feedback on that. > Add integration tests for hive on S3 > > > Key: HIVE-14373 > URL: https://issues.apache.org/jira/browse/HIVE-14373 > Project: Hive > Issue Type: Sub-task >Reporter: Sergio Peña >Assignee: Abdullah Yousufi > Attachments: HIVE-14373.02.patch, HIVE-14373.patch > > > With Hive doing improvements to run on S3, it would be ideal to have better > integration testing on S3. > These S3 tests won't be able to be executed by HiveQA because it will need > Amazon credentials. We need to write a suite based on ideas from the Hadoop > project where: > - an xml file is provided with S3 credentials > - a committer must run these tests manually to verify it works > - the xml file should not be part of the commit, and hiveqa should not run > these tests. > https://wiki.apache.org/hadoop/HowToContribute#Submitting_patches_against_object_stores_such_as_Amazon_S3.2C_OpenStack_Swift_and_Microsoft_Azure
[jira] [Commented] (HIVE-14165) Enable faster S3 Split Computation
[ https://issues.apache.org/jira/browse/HIVE-14165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15423120#comment-15423120 ] Abdullah Yousufi commented on HIVE-14165: - It calls FileSystem.java#listStatus(Path p, PathFilter filter). And that's correct, it verifies that there is at least one FileStatus under the current path, at which point it begins the logic of determining splits, primarily by calling InputFormat#getSplits(JobConf job, int numSplits). But FileInputFormat#getSplits(JobContext job) is going to call listStatus() anyway. When I remove this listing, I get a 2x speed increase on a 500-partition S3 table. Could FileInputFormat#getSplits(job) be modified to short-circuit and throw a FileNotFoundException in the cases of a non-existent path and 0 files found, so that Hive could catch that and continue?
[jira] [Commented] (HIVE-14165) Enable faster S3 Split Computation
[ https://issues.apache.org/jira/browse/HIVE-14165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421901#comment-15421901 ] Abdullah Yousufi commented on HIVE-14165: - So I tried the listFiles() optimization locally and modified Hive to call the function on the root directory of a partitioned table. While this does give a speedup for a select * query on a partitioned table, the approach is not really extensible to queries that do partition elimination, since in those cases it makes sense to just pass in the relevant partitions, as Hive currently does. I'm thinking it might make sense to remove the following list call in Hive in the case of S3 partitioned tables, since the listing for the split computation is going to happen later anyway in Hadoop's FileInputFormat.java. FetchOperator.java#getNextPath() {code} if (fs.exists(currPath)) { for (FileStatus fStat : listStatusUnderPath(fs, currPath)) { if (fStat.getLen() > 0) { return true; } } } {code} My question is whether it sounds good to remove this check. It seems FileInputFormat.java#getSplits() may raise errors if the partition directory does not contain any files, but is there a better way to handle that?
[jira] [Updated] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-14373: Attachment: HIVE-14373.02.patch
[jira] [Commented] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417975#comment-15417975 ] Abdullah Yousufi commented on HIVE-14373: - Sure that sounds great! You could email me the patch if that works for you.
[jira] [Commented] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15416168#comment-15416168 ] Abdullah Yousufi commented on HIVE-14373: - Hey [~poeppt], that's a pretty good idea and also the ideal behavior for blobstore testing. We'd need to investigate how that would work and what changes would be necessary on hiveqa. Perhaps it would make more sense as a follow-up ticket once this change is in? Let me know if that sounds reasonable. Also, thanks for your reviews on the patch as well!
[jira] [Commented] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15415622#comment-15415622 ] Abdullah Yousufi commented on HIVE-14373: - Hey [~yalovyyi], currently the best way to switch between different s3 clients would be to use the different key names in auth-keys.xml. I created auth-keys.xml.template as an s3a example, but that could be easily changed for s3n. However, I agree that the bucket variable name in that file should not be specific to s3a. Also, thanks a ton for the review on the patch, I'll get to that shortly.
[jira] [Updated] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-14373: Attachment: HIVE-14373.patch
[jira] [Updated] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-14373: Attachment: (was: HIVE-14373.patch)
[jira] [Updated] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-14373: Attachment: HIVE-14373.patch
[jira] [Updated] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-14373: Status: Patch Available (was: Open)
[jira] [Assigned] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi reassigned HIVE-14373: --- Assignee: Abdullah Yousufi
[jira] [Assigned] (HIVE-14272) ConditionalResolverMergeFiles should keep staging data on HDFS, then copy (no rename) to S3
[ https://issues.apache.org/jira/browse/HIVE-14272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi reassigned HIVE-14272: --- Assignee: Abdullah Yousufi (was: Sergio Peña) > ConditionalResolverMergeFiles should keep staging data on HDFS, then copy (no > rename) to S3 > --- > > Key: HIVE-14272 > URL: https://issues.apache.org/jira/browse/HIVE-14272 > Project: Hive > Issue Type: Sub-task >Reporter: Sergio Peña >Assignee: Abdullah Yousufi > > If {{hive.merge.mapfiles}} is True, and the output table to write is on S3, > then Hive will generate a conditional plan where smaller files will be merged > into larger sizes. > If the output files written by the initial MR job are small, then a second MR > job is run to merge the output into larger files (a copy from S3 to S3 in the > current code). > If the original output files are large enough, then the conditional task is > followed by a move/rename which is very expensive in S3. > We should keep staging data on HDFS previous to copying them to S3 as final > files.
[jira] [Updated] (HIVE-14165) Enable faster S3 Split Computation
[ https://issues.apache.org/jira/browse/HIVE-14165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-14165: Description: Split size computation may be improved by the optimizations for listFiles() in HADOOP-13208 (was: During split computation when a large number of files are required to be listed from S3, instead of executing 1 API call per file, one can optimize by listing 1000 files in each API call. This would reduce the amount of time required for listing files. Qubole has this optimization in place as detailed here: https://www.qubole.com/blog/product/optimizing-hadoop-for-s3-part-1/?nabe=5695374637924352:0)
[jira] [Updated] (HIVE-14165) Enable faster S3 Split Computation
[ https://issues.apache.org/jira/browse/HIVE-14165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-14165: Summary: Enable faster S3 Split Computation (was: Enable faster S3 Split Computation by listing files in blocks)
[jira] [Commented] (HIVE-14165) Enable faster S3 Split Computation by listing files in blocks
[ https://issues.apache.org/jira/browse/HIVE-14165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15392423#comment-15392423 ] Abdullah Yousufi commented on HIVE-14165: - Thanks for the clarification Steve, looking forward to that O(files/1000) recursive list
[jira] [Commented] (HIVE-14301) insert overwrite fails for nonpartitioned tables in s3
[ https://issues.apache.org/jira/browse/HIVE-14301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15388422#comment-15388422 ] Abdullah Yousufi commented on HIVE-14301: - [~ste...@apache.org] The issue is that line 672 of S3AFileSystem.java returns false in the innerRename function since the dstKey is "": {code} if (srcKey.isEmpty() || dstKey.isEmpty()) { LOG.debug("rename: source {} or dest {}, is empty", srcKey, dstKey); return false; } {code} Is there a reason for this check, especially with regards to the destination key? > insert overwrite fails for nonpartitioned tables in s3 > -- > > Key: HIVE-14301 > URL: https://issues.apache.org/jira/browse/HIVE-14301 > Project: Hive > Issue Type: Bug > Components: Query Processor >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Fix For: 2.2.0, 2.1.1 > > Attachments: HIVE-14301.1.patch > > > {noformat} > hive> insert overwrite table s3_2 select * from default.test2; > Query ID = hrt_qa_20160719164737_90fb1f30-0ade-4a64-ab65-a6a7550be25a > Total jobs = 1 > Launching Job 1 out of 1 > Status: Running (Executing on YARN cluster with App id > application_1468941549982_0010) > > VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED > KILLED > > Map 1 .. 
SUCCEEDED 1 100 0 > 0 > > VERTICES: 01/01 [==>>] 100% ELAPSED TIME: 11.90 s > > > Loading data to table default.s3_2 > Failed with exception java.io.IOException: rename for src path: > s3a://test-ks/test2/.hive-staging_hive_2016-07-19_16-47-37_787_4725676452829013403-1/-ext-1/00_0.deflate > to dest path:s3a://test-ks/test2/00_0.deflate returned false > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.MoveTask > 2016-07-19 16:43:46,244 ERROR [main]: exec.Task > (SessionState.java:printError(948)) - Failed with exception > java.io.IOException: rename for src path: > s3a://test-ks/testing/.hive-staging_hive_2016-07-19_16-42-20_739_1716954454570249450-1/-ext-1/00_0.deflate > to dest path:s3a://test-ks/testing/00_0.deflate returned false > org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: rename > for src path: > s3a://test-ks/testing/.hive-staging_hive_2016-07-19_16-42-20_739_1716954454570249450-1/-ext-1/00_0.deflate > to dest path:s3a://test-ks/testing/00_0.deflate returned false > at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2856) > at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:3113) > at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1700) > at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:328) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1726) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1472) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1271) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1138) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1128) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:216) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:168) > at 
org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:379) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:739) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:684) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:624) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > Caused by: java.io.IOException: rename for src path: > s3a://test-ks/testing/.hive-staging_hive_2016-07-19_16-42-20_739_1716954454570249450-1/-ext-1/00_0.deflate > to dest path:s3a://test-ks/testing/00_0.deflate returned false >
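The empty-destination-key check quoted in the comment above can be reproduced in isolation. The sketch below is illustrative, not the actual S3AFileSystem code: the class and method are hypothetical stand-ins that mirror the idea that an object key is the URI path minus its leading slash, so a rename whose destination is the bucket root reduces to an empty key and trips the quoted guard.

```java
// Illustrative sketch (not the actual S3AFileSystem source): an object key
// is the URI path with the leading '/' stripped, so a destination at the
// bucket root reduces to the empty key that trips the quoted check.
import java.net.URI;

public class KeySketch {
    // Hypothetical stand-in for S3AFileSystem's path-to-key conversion.
    static String pathToKey(URI path) {
        String p = path.getPath();
        return p.startsWith("/") ? p.substring(1) : p;
    }

    public static void main(String[] args) {
        // A file inside the bucket yields a non-empty key.
        System.out.println(pathToKey(URI.create("s3a://test-ks/testing/part-0"))); // testing/part-0
        // The bucket root yields "", so the empty-key guard in innerRename returns false.
        System.out.println(pathToKey(URI.create("s3a://test-ks/"))); // (empty string)
    }
}
```

This is consistent with the report further down that the query only fails when the table lives at s3://bucket-name/ itself.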
[jira] [Comment Edited] (HIVE-14301) insert overwrite fails for nonpartitioned tables in s3
[ https://issues.apache.org/jira/browse/HIVE-14301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15388371#comment-15388371 ] Abdullah Yousufi edited comment on HIVE-14301 at 7/21/16 8:30 PM: -- {noformat} 2016-07-21T12:16:55,698 ERROR [515de560-6446-44e6-a9f3-c53dc628e357 main] exec.Task: Failed with exception Error moving: s3a://dev-ayousufi/.hive-staging_hive_2016-07-21_12-16-28_703_1653610595303080982-1/-ext-1 into: s3a://dev-ayousufi/ org.apache.hadoop.hive.ql.metadata.HiveException: Error moving: s3a://dev-ayousufi/.hive-staging_hive_2016-07-21_12-16-28_703_1653610595303080982-1/-ext-1 into: s3a://dev-ayousufi/ at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:3248) at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1817) at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:356) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1870) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1574) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1326) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1095) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1083) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at 
java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:239) at org.apache.hadoop.util.RunJar.main(RunJar.java:153) Caused by: java.io.IOException: Error moving: s3a://dev-ayousufi/.hive-staging_hive_2016-07-21_12-16-28_703_1653610595303080982-1/-ext-1 into: s3a://dev-ayousufi/ at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:3238) ... 21 more 2016-07-21T12:16:55,698 ERROR [515de560-6446-44e6-a9f3-c53dc628e357 main] ql.Driver: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. Error moving: s3a://dev-ayousufi/.hive-staging_hive_2016-07-21_12-16-28_703_1653610595303080982-1/-ext-1 into: s3a://dev-ayousufi/ {noformat}
[jira] [Commented] (HIVE-14271) FileSinkOperator should not rename files to final paths when S3 is the default destination
[ https://issues.apache.org/jira/browse/HIVE-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15388230#comment-15388230 ] Abdullah Yousufi commented on HIVE-14271: - Agreed. I'll upload a patch for the second approach shortly. > FileSinkOperator should not rename files to final paths when S3 is the > default destination > -- > > Key: HIVE-14271 > URL: https://issues.apache.org/jira/browse/HIVE-14271 > Project: Hive > Issue Type: Sub-task >Reporter: Sergio Peña >Assignee: Abdullah Yousufi > > FileSinkOperator does a rename of {{outPaths -> finalPaths}} when it finished > writing all rows to a temporary path. The problem is that S3 does not support > renaming. > Two options can be considered: > a. Use a copy operation instead. After FileSinkOperator writes all rows to > outPaths, then the commit method will do a copy() call instead of move(). > b. Write row by row directly to the S3 path (see HIVE-1620). This may add > better performance calls, but we should take care of the cleanup part in case > of writing errors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
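Option (a) above, committing with a copy instead of a rename, can be sketched as follows. This is a minimal illustration only: it uses local java.nio files as a stand-in for the object store, and the class and method names are hypothetical, not Hive's FileSinkOperator API.

```java
// Sketch of option (a): on a store without efficient/atomic rename, "commit"
// a finished temporary file by copying its bytes to the final location and
// then deleting the temporary. Local files stand in for S3 objects here.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class CopyCommitSketch {
    static void commit(Path outPath, Path finalPath) throws IOException {
        // Copy rather than rename; REPLACE_EXISTING mirrors overwrite semantics.
        Files.copy(outPath, finalPath, StandardCopyOption.REPLACE_EXISTING);
        // Clean up the temporary file once the copy has succeeded.
        Files.delete(outPath);
    }
}
```

The trade-off the comment alludes to: copy-then-delete doubles the data movement relative to option (b)'s direct write, but keeps the existing write-to-staging flow and its error-cleanup behavior intact.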
[jira] [Commented] (HIVE-14301) insert overwrite fails for nonpartitioned tables in s3
[ https://issues.apache.org/jira/browse/HIVE-14301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15388228#comment-15388228 ] Abdullah Yousufi commented on HIVE-14301: - not a complete stack trace (will post as soon as I locate the log file), but here's the output from running the query: {noformat} hive> insert overwrite table s3dummy values (1); WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases. Query ID = abdullah.yousufi_20160721114424_4c7b2d9d-88de-44f3-b2a1-ebc715e2dcbf Total jobs = 3 Launching Job 1 out of 3 Number of reduce tasks is set to 0 since there's no reduce operator Job running in-process (local Hadoop) 2016-07-21 11:44:32,888 Stage-1 map = 0%, reduce = 0% 2016-07-21 11:44:36,906 Stage-1 map = 100%, reduce = 0% Ended Job = job_local1593349645_0001 Stage-4 is selected by condition resolver. Stage-3 is filtered out by condition resolver. Stage-5 is filtered out by condition resolver. Moving data to directory s3a://dev-ayousufi/.hive-staging_hive_2016-07-21_11-44-24_590_2809548621359300870-1/-ext-1 Loading data to table default.s3dummy Failed with exception Error moving: s3a://dev-ayousufi/.hive-staging_hive_2016-07-21_11-44-24_590_2809548621359300870-1/-ext-1 into: s3a://dev-ayousufi/ FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. 
Error moving: s3a://dev-ayousufi/.hive-staging_hive_2016-07-21_11-44-24_590_2809548621359300870-1/-ext-1 into: s3a://dev-ayousufi/ MapReduce Jobs Launched: Stage-Stage-1: HDFS Read: 5 HDFS Write: 5 SUCCESS Total MapReduce CPU Time Spent: 0 msec {noformat} > insert overwrite fails for nonpartitioned tables in s3 > -- > > Key: HIVE-14301 > URL: https://issues.apache.org/jira/browse/HIVE-14301 > Project: Hive > Issue Type: Bug > Components: Query Processor >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Fix For: 2.2.0, 2.1.1 > > Attachments: HIVE-14301.1.patch > > > {noformat} > hive> insert overwrite table s3_2 select * from default.test2; > Query ID = hrt_qa_20160719164737_90fb1f30-0ade-4a64-ab65-a6a7550be25a > Total jobs = 1 > Launching Job 1 out of 1 > Status: Running (Executing on YARN cluster with App id > application_1468941549982_0010) > > VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED > KILLED > > Map 1 .. SUCCEEDED 1 100 0 > 0 > > VERTICES: 01/01 [==>>] 100% ELAPSED TIME: 11.90 s > > > Loading data to table default.s3_2 > Failed with exception java.io.IOException: rename for src path: > s3a://test-ks/test2/.hive-staging_hive_2016-07-19_16-47-37_787_4725676452829013403-1/-ext-1/00_0.deflate > to dest path:s3a://test-ks/test2/00_0.deflate returned false > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.MoveTask > 2016-07-19 16:43:46,244 ERROR [main]: exec.Task > (SessionState.java:printError(948)) - Failed with exception > java.io.IOException: rename for src path: > s3a://test-ks/testing/.hive-staging_hive_2016-07-19_16-42-20_739_1716954454570249450-1/-ext-1/00_0.deflate > to dest path:s3a://test-ks/testing/00_0.deflate returned false > org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: rename > for src path: > s3a://test-ks/testing/.hive-staging_hive_2016-07-19_16-42-20_739_1716954454570249450-1/-ext-1/00_0.deflate > to dest path:s3a://test-ks/testing/00_0.deflate returned 
false > at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2856) > at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:3113) > at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1700) > at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:328) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1726) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1472) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1271) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1138) > at
[jira] [Commented] (HIVE-14301) insert overwrite fails for nonpartitioned tables in s3
[ https://issues.apache.org/jira/browse/HIVE-14301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15388175#comment-15388175 ] Abdullah Yousufi commented on HIVE-14301: - Hey [~rajesh.balamohan] I was testing this and found that the query still fails if the table is created in the root directory of the bucket (at s3://bucket-name/). I don't know how common of a use case that is, but thought I'd mention it. > insert overwrite fails for nonpartitioned tables in s3 > -- > > Key: HIVE-14301 > URL: https://issues.apache.org/jira/browse/HIVE-14301 > Project: Hive > Issue Type: Bug > Components: Query Processor >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Fix For: 2.2.0, 2.1.1 > > Attachments: HIVE-14301.1.patch > > > {noformat} > hive> insert overwrite table s3_2 select * from default.test2; > Query ID = hrt_qa_20160719164737_90fb1f30-0ade-4a64-ab65-a6a7550be25a > Total jobs = 1 > Launching Job 1 out of 1 > Status: Running (Executing on YARN cluster with App id > application_1468941549982_0010) > > VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED > KILLED > > Map 1 .. 
SUCCEEDED 1 100 0 > 0 > > VERTICES: 01/01 [==>>] 100% ELAPSED TIME: 11.90 s > > > Loading data to table default.s3_2 > Failed with exception java.io.IOException: rename for src path: > s3a://test-ks/test2/.hive-staging_hive_2016-07-19_16-47-37_787_4725676452829013403-1/-ext-1/00_0.deflate > to dest path:s3a://test-ks/test2/00_0.deflate returned false > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.MoveTask > 2016-07-19 16:43:46,244 ERROR [main]: exec.Task > (SessionState.java:printError(948)) - Failed with exception > java.io.IOException: rename for src path: > s3a://test-ks/testing/.hive-staging_hive_2016-07-19_16-42-20_739_1716954454570249450-1/-ext-1/00_0.deflate > to dest path:s3a://test-ks/testing/00_0.deflate returned false > org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: rename > for src path: > s3a://test-ks/testing/.hive-staging_hive_2016-07-19_16-42-20_739_1716954454570249450-1/-ext-1/00_0.deflate > to dest path:s3a://test-ks/testing/00_0.deflate returned false > at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2856) > at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:3113) > at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1700) > at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:328) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1726) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1472) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1271) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1138) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1128) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:216) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:168) > at 
org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:379) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:739) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:684) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:624) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > Caused by: java.io.IOException: rename for src path: > s3a://test-ks/testing/.hive-staging_hive_2016-07-19_16-42-20_739_1716954454570249450-1/-ext-1/00_0.deflate > to dest path:s3a://test-ks/testing/00_0.deflate returned false > at org.apache.hadoop.hive.ql.metadata.Hive$3.call(Hive.java:2836) > at
[jira] [Commented] (HIVE-14074) RELOAD FUNCTION should update dropped functions
[ https://issues.apache.org/jira/browse/HIVE-14074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375929#comment-15375929 ] Abdullah Yousufi commented on HIVE-14074: - Yeah the regex will return all permanent UDFs, as the format for those functions is "<db_name>.<function_name>". This will exclude built-in and temporary functions, which should not be affected by the reload. > RELOAD FUNCTION should update dropped functions > --- > > Key: HIVE-14074 > URL: https://issues.apache.org/jira/browse/HIVE-14074 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.1 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi > Fix For: 2.2.0 > > Attachments: HIVE-14074.01.patch, HIVE-14074.02.patch, > HIVE-14074.03.patch > > > Due to HIVE-2573, functions are stored in a per-session registry and only > loaded in from the metastore when hs2 or hive cli is started. Running RELOAD > FUNCTION in the current session is a way to force a reload of the functions, > so that changes that occurred in other running sessions will be reflected in > the current session, without having to restart the current session. However, > while functions that are created in other sessions will now appear in the > current session, functions that have been dropped are not removed from the > current session's registry. It seems inconsistent that created functions are > updated while dropped functions are not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
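The qualified-name distinction described in the comment above can be illustrated with a small sketch. The pattern shown is illustrative only, not necessarily the exact regex used in the patch: permanent UDFs carry a db-qualified name, while built-in and temporary functions have bare names.

```java
import java.util.regex.Pattern;

public class QualifiedNameSketch {
    // Permanent UDFs are registered as "db.function"; built-ins and
    // temporary functions are unqualified. Pattern is illustrative only.
    static final Pattern QUALIFIED = Pattern.compile(".+\\..+");

    static boolean isPermanent(String functionName) {
        return QUALIFIED.matcher(functionName).matches();
    }

    public static void main(String[] args) {
        System.out.println(isPermanent("mydb.myudf")); // true
        System.out.println(isPermanent("upper"));      // false (built-in)
    }
}
```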
[jira] [Updated] (HIVE-14074) RELOAD FUNCTION should update dropped functions
[ https://issues.apache.org/jira/browse/HIVE-14074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-14074: Attachment: HIVE-14074.03.patch > RELOAD FUNCTION should update dropped functions > --- > > Key: HIVE-14074 > URL: https://issues.apache.org/jira/browse/HIVE-14074 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.1 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi > Fix For: 2.2.0 > > Attachments: HIVE-14074.01.patch, HIVE-14074.02.patch, > HIVE-14074.03.patch > > > Due to HIVE-2573, functions are stored in a per-session registry and only > loaded in from the metastore when hs2 or hive cli is started. Running RELOAD > FUNCTION in the current session is a way to force a reload of the functions, > so that changes that occurred in other running sessions will be reflected in > the current session, without having to restart the current session. However, > while functions that are created in other sessions will now appear in the > current session, functions that have been dropped are not removed from the > current session's registry. It seems inconsistent that created functions are > updated while dropped functions are not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14165) Enable faster S3 Split Computation by listing files in blocks
[ https://issues.apache.org/jira/browse/HIVE-14165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-14165: Description: During split computation when a large number of files are required to be listed from S3, instead of executing 1 API call per file, one can optimize by listing 1000 files in each API call. This would reduce the amount of time required for listing files. Qubole has this optimization in place as detailed here: https://www.qubole.com/blog/product/optimizing-hadoop-for-s3-part-1/?nabe=5695374637924352:0 was: During split computation when a large of files are required to be listed from S3 then instead of executing 1 API call per file, one can optimize by listing 1000 files in each API call. Thereby reducing the amount of time required for listing files. Qubole has this optimization in place as detailed here: https://www.qubole.com/blog/product/optimizing-hadoop-for-s3-part-1/?nabe=5695374637924352:0 > Enable faster S3 Split Computation by listing files in blocks > - > > Key: HIVE-14165 > URL: https://issues.apache.org/jira/browse/HIVE-14165 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.1.0 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi > > During split computation when a large number of files are required to be > listed from S3, instead of executing 1 API call per file, one can optimize by > listing 1000 files in each API call. This would reduce the amount of time > required for listing files. > Qubole has this optimization in place as detailed here: > https://www.qubole.com/blog/product/optimizing-hadoop-for-s3-part-1/?nabe=5695374637924352:0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
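The call-count reduction described above is simple arithmetic: paging up to 1000 keys per request turns N per-file listing calls into ceil(N / 1000) calls. A small sketch (method names are illustrative, not an actual Hive or AWS SDK API):

```java
// Compares the number of S3 listing API calls needed with and without
// batched (paged) listing, assuming up to 1000 keys returned per call.
public class ListingCalls {
    static long perFileCalls(long files) {
        return files; // one API call per file
    }

    static long batchedCalls(long files, int pageSize) {
        return (files + pageSize - 1) / pageSize; // ceiling division
    }

    public static void main(String[] args) {
        System.out.println(perFileCalls(25_000));       // 25000
        System.out.println(batchedCalls(25_000, 1000)); // 25
    }
}
```

At 25,000 files, that is a 1000x reduction in listing round-trips during split computation.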
[jira] [Updated] (HIVE-14149) Joda Time causes an AmazonS3Exception on Hadoop3.0.0
[ https://issues.apache.org/jira/browse/HIVE-14149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-14149: Attachment: HIVE-14149.1.patch > Joda Time causes an AmazonS3Exception on Hadoop3.0.0 > > > Key: HIVE-14149 > URL: https://issues.apache.org/jira/browse/HIVE-14149 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.1.0 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi > Attachments: HIVE-14149.1.patch > > > Java1.8u60 and higher cause Joda Time 2.5 to incorrectly format timezones, > which leads to the aws server rejecting requests with the aws sdk hadoop3.0 > uses. This means any queries involving the s3a connector will return the > following AmazonS3Exception: > {code} > com.amazonaws.services.s3.model.AmazonS3Exception: AWS authentication > requires a valid Date or x-amz-date header > {code} > The fix for this is to update Joda Time from 2.5 to 2.8.1. See here for > details: > https://github.com/aws/aws-sdk-java/issues/444 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
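The fix described in the issue is a dependency bump from Joda Time 2.5 to 2.8.1. As a rough sketch only, the corresponding Maven dependency declaration would look like the following; the exact placement and whether Hive manages the version through a property are not shown here:

```xml
<!-- illustrative sketch: bump Joda Time past the timezone-formatting bug -->
<dependency>
  <groupId>joda-time</groupId>
  <artifactId>joda-time</artifactId>
  <version>2.8.1</version>
</dependency>
```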
[jira] [Updated] (HIVE-14149) Joda Time causes an AmazonS3Exception on Hadoop3.0.0
[ https://issues.apache.org/jira/browse/HIVE-14149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-14149: Status: Patch Available (was: Open) > Joda Time causes an AmazonS3Exception on Hadoop3.0.0 > > > Key: HIVE-14149 > URL: https://issues.apache.org/jira/browse/HIVE-14149 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.1.0 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi > Attachments: HIVE-14149.1.patch > > > Java1.8u60 and higher cause Joda Time 2.5 to incorrectly format timezones, > which leads to the aws server rejecting requests with the aws sdk hadoop3.0 > uses. This means any queries involving the s3a connector will return the > following AmazonS3Exception: > {code} > com.amazonaws.services.s3.model.AmazonS3Exception: AWS authentication > requires a valid Date or x-amz-date header > {code} > The fix for this is to update Joda Time from 2.5 to 2.8.1. See here for > details: > https://github.com/aws/aws-sdk-java/issues/444 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14149) Joda Time causes an AmazonS3Exception on Hadoop3.0.0
[ https://issues.apache.org/jira/browse/HIVE-14149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-14149: Summary: Joda Time causes an AmazonS3Exception on Hadoop3.0.0 (was: s3a queries cause an AmazonS3Exception on Hadoop3.0 with Java1.8u60 and higher) > Joda Time causes an AmazonS3Exception on Hadoop3.0.0 > > > Key: HIVE-14149 > URL: https://issues.apache.org/jira/browse/HIVE-14149 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.1.0 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi > > Java1.8u60 and higher cause Joda Time 2.5 to incorrectly format timezones, > which leads to the aws server rejecting requests with the aws sdk hadoop3.0 > uses. This means any queries involving the s3a connector will return the > following AmazonS3Exception: > {code} > com.amazonaws.services.s3.model.AmazonS3Exception: AWS authentication > requires a valid Date or x-amz-date header > {code} > The fix for this is to update Joda Time from 2.5 to 2.8.1. See here for > details: > https://github.com/aws/aws-sdk-java/issues/444 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14149) s3a queries cause an AmazonS3Exception on Hadoop3.0 with Java1.8u60 and higher
[ https://issues.apache.org/jira/browse/HIVE-14149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-14149: Summary: s3a queries cause an AmazonS3Exception on Hadoop3.0 with Java1.8u60 and higher (was: S3A connector throws an AmazonS3Exception on Hadoop3.0 with Java1.8u60 and higher) > s3a queries cause an AmazonS3Exception on Hadoop3.0 with Java1.8u60 and higher > -- > > Key: HIVE-14149 > URL: https://issues.apache.org/jira/browse/HIVE-14149 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.1.0 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi > > Java1.8u60 and higher cause Joda Time 2.5 to incorrectly format timezones, > which leads to the aws server rejecting requests with the aws sdk hadoop3.0 > uses. This means any queries involving the s3a connector will return the > following AmazonS3Exception: > {code} > com.amazonaws.services.s3.model.AmazonS3Exception: AWS authentication > requires a valid Date or x-amz-date header > {code} > The fix for this is to update Joda Time from 2.5 to 2.8.1. See here for > details: > https://github.com/aws/aws-sdk-java/issues/444 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13964) Add a parameter to beeline to allow a properties file to be passed in
[ https://issues.apache.org/jira/browse/HIVE-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15350486#comment-15350486 ] Abdullah Yousufi commented on HIVE-13964: - Yeah that could be helpful, thanks. --- props: url= user= password= driver= > Add a parameter to beeline to allow a properties file to be passed in > - > > Key: HIVE-13964 > URL: https://issues.apache.org/jira/browse/HIVE-13964 > Project: Hive > Issue Type: New Feature > Components: Beeline >Affects Versions: 2.0.1 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi >Priority: Minor > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-13964.01.patch, HIVE-13964.02.patch, > HIVE-13964.03.patch, HIVE-13964.04.patch, HIVE-13964.05.patch > > > HIVE-6652 removed the ability to pass in a properties file as a beeline > parameter. It may be a useful feature to be able to pass the file in as a > parameter, such as --property-file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
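Filled in with hypothetical values (the URL, credentials, and driver class below are illustrative only), a properties file covering the keys listed in the comment above would look like:

```properties
# hypothetical props file for beeline --property-file
url=jdbc:hive2://localhost:10000/default
user=hive
password=secret
driver=org.apache.hive.jdbc.HiveDriver
```

invoked as: beeline --property-file props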
[jira] [Commented] (HIVE-13964) Add a parameter to beeline to allow a properties file to be passed in
[ https://issues.apache.org/jira/browse/HIVE-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15348649#comment-15348649 ] Abdullah Yousufi commented on HIVE-13964: - Would something like this work? - --property-file The file to read connection properties (url, driver, user, password) from. Usage: beeline --property-file props > Add a parameter to beeline to allow a properties file to be passed in > - > > Key: HIVE-13964 > URL: https://issues.apache.org/jira/browse/HIVE-13964 > Project: Hive > Issue Type: New Feature > Components: Beeline >Affects Versions: 2.0.1 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi >Priority: Minor > Labels: Docs > Fix For: 2.2.0 > > Attachments: HIVE-13964.01.patch, HIVE-13964.02.patch, > HIVE-13964.03.patch, HIVE-13964.04.patch, HIVE-13964.05.patch > > > HIVE-6652 removed the ability to pass in a properties file as a beeline > parameter. It may be a useful feature to be able to pass the file in as a > parameter, such as --property-file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14074) RELOAD FUNCTION should update dropped functions
[ https://issues.apache.org/jira/browse/HIVE-14074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-14074: Attachment: (was: HIVE-14074.02.patch) > RELOAD FUNCTION should update dropped functions > --- > > Key: HIVE-14074 > URL: https://issues.apache.org/jira/browse/HIVE-14074 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.1 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi > Fix For: 2.2.0 > > Attachments: HIVE-14074.01.patch, HIVE-14074.02.patch > > > Due to HIVE-2573, functions are stored in a per-session registry and only > loaded in from the metastore when hs2 or hive cli is started. Running RELOAD > FUNCTION in the current session is a way to force a reload of the functions, > so that changes that occurred in other running sessions will be reflected in > the current session, without having to restart the current session. However, > while functions that are created in other sessions will now appear in the > current session, functions that have been dropped are not removed from the > current session's registry. It seems inconsistent that created functions are > updated while dropped functions are not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14074) RELOAD FUNCTION should update dropped functions
[ https://issues.apache.org/jira/browse/HIVE-14074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-14074: Attachment: HIVE-14074.02.patch reuploading patch 2, to see if the same tests fail again > RELOAD FUNCTION should update dropped functions > --- > > Key: HIVE-14074 > URL: https://issues.apache.org/jira/browse/HIVE-14074 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.1 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi > Fix For: 2.2.0 > > Attachments: HIVE-14074.01.patch, HIVE-14074.02.patch, > HIVE-14074.02.patch > > > Due to HIVE-2573, functions are stored in a per-session registry and only > loaded in from the metastore when hs2 or hive cli is started. Running RELOAD > FUNCTION in the current session is a way to force a reload of the functions, > so that changes that occurred in other running sessions will be reflected in > the current session, without having to restart the current session. However, > while functions that are created in other sessions will now appear in the > current session, functions that have been dropped are not removed from the > current session's registry. It seems inconsistent that created functions are > updated while dropped functions are not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14074) RELOAD FUNCTION should update dropped functions
[ https://issues.apache.org/jira/browse/HIVE-14074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-14074: Attachment: HIVE-14074.02.patch > RELOAD FUNCTION should update dropped functions > --- > > Key: HIVE-14074 > URL: https://issues.apache.org/jira/browse/HIVE-14074 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.1 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi > Fix For: 2.2.0 > > Attachments: HIVE-14074.01.patch, HIVE-14074.02.patch > > > Due to HIVE-2573, functions are stored in a per-session registry and only > loaded in from the metastore when hs2 or hive cli is started. Running RELOAD > FUNCTION in the current session is a way to force a reload of the functions, > so that changes that occurred in other running sessions will be reflected in > the current session, without having to restart the current session. However, > while functions that are created in other sessions will now appear in the > current session, functions that have been dropped are not removed from the > current session's registry. It seems inconsistent that created functions are > updated while dropped functions are not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14074) RELOAD FUNCTION should update dropped functions
[ https://issues.apache.org/jira/browse/HIVE-14074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15344987#comment-15344987 ] Abdullah Yousufi commented on HIVE-14074: - Ah that's a good point. I had considered the set removal method, but was curious if this one-liner would work. I'll implement it that way instead. > RELOAD FUNCTION should update dropped functions > --- > > Key: HIVE-14074 > URL: https://issues.apache.org/jira/browse/HIVE-14074 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.1 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi > Fix For: 2.2.0 > > Attachments: HIVE-14074.01.patch > > > Due to HIVE-2573, functions are stored in a per-session registry and only > loaded in from the metastore when hs2 or hive cli is started. Running RELOAD > FUNCTION in the current session is a way to force a reload of the functions, > so that changes that occurred in other running sessions will be reflected in > the current session, without having to restart the current session. However, > while functions that are created in other sessions will now appear in the > current session, functions that have been dropped are not removed from the > current session's registry. It seems inconsistent that created functions are > updated while dropped functions are not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
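The "set removal" approach mentioned above can be sketched as follows. Class and method names here are illustrative stand-ins, not Hive's actual Registry API: on reload, the per-session registry first drops local entries absent from the metastore snapshot, then merges in newly created ones.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Minimal sketch of a per-session function registry whose reload also
// removes functions dropped in other sessions.
public class SessionRegistrySketch {
    private final Map<String, String> functions = new HashMap<>();

    void register(String name, String className) {
        functions.put(name, className);
    }

    // metastoreFunctions is the authoritative snapshot read back from the
    // metastore at RELOAD FUNCTION time.
    void reload(Map<String, String> metastoreFunctions) {
        // Functions dropped elsewhere: present locally, gone upstream.
        Set<String> stale = new HashSet<>(functions.keySet());
        stale.removeAll(metastoreFunctions.keySet());
        functions.keySet().removeAll(stale);
        // Functions created elsewhere are merged in.
        functions.putAll(metastoreFunctions);
    }

    boolean contains(String name) {
        return functions.containsKey(name);
    }
}
```

The set difference makes the removal explicit and avoids mutating the registry while iterating over it.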
[jira] [Commented] (HIVE-14074) RELOAD FUNCTION should update dropped functions
[ https://issues.apache.org/jira/browse/HIVE-14074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15344758#comment-15344758 ] Abdullah Yousufi commented on HIVE-14074: - No I don't believe that should be an issue because the function registry is local to the session. So other sessions won't experience any changes with their functions until they reload their functions. And when the other sessions do reload their functions, they will be reading from the metastore, which is thread safe. > RELOAD FUNCTION should update dropped functions > --- > > Key: HIVE-14074 > URL: https://issues.apache.org/jira/browse/HIVE-14074 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.1 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi > Fix For: 2.2.0 > > Attachments: HIVE-14074.01.patch > > > Due to HIVE-2573, functions are stored in a per-session registry and only > loaded in from the metastore when hs2 or hive cli is started. Running RELOAD > FUNCTION in the current session is a way to force a reload of the functions, > so that changes that occurred in other running sessions will be reflected in > the current session, without having to restart the current session. However, > while functions that are created in other sessions will now appear in the > current session, functions that have been dropped are not removed from the > current session's registry. It seems inconsistent that created functions are > updated while dropped functions are not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14074) RELOAD FUNCTION should update dropped functions
[ https://issues.apache.org/jira/browse/HIVE-14074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-14074: Attachment: HIVE-14074.01.patch > RELOAD FUNCTION should update dropped functions > --- > > Key: HIVE-14074 > URL: https://issues.apache.org/jira/browse/HIVE-14074 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.1 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi > Fix For: 2.2.0 > > Attachments: HIVE-14074.01.patch > > > Due to HIVE-2573, functions are stored in a per-session registry and only > loaded in from the metastore when hs2 or hive cli is started. Running RELOAD > FUNCTION in the current session is a way to force a reload of the functions, > so that changes that occurred in other running sessions will be reflected in > the current session, without having to restart the current session. However, > while functions that are created in other sessions will now appear in the > current session, functions that have been dropped are not removed from the > current session's registry. It seems inconsistent that created functions are > updated while dropped functions are not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14074) RELOAD FUNCTION should update dropped functions
[ https://issues.apache.org/jira/browse/HIVE-14074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-14074: Status: Patch Available (was: Open) > RELOAD FUNCTION should update dropped functions > --- > > Key: HIVE-14074 > URL: https://issues.apache.org/jira/browse/HIVE-14074 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.1 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi > Fix For: 2.2.0 > > > Due to HIVE-2573, functions are stored in a per-session registry and only > loaded in from the metastore when hs2 or hive cli is started. Running RELOAD > FUNCTION in the current session is a way to force a reload of the functions, > so that changes that occurred in other running sessions will be reflected in > the current session, without having to restart the current session. However, > while functions that are created in other sessions will now appear in the > current session, functions that have been dropped are not removed from the > current session's registry. It seems inconsistent that created functions are > updated while dropped functions are not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14049) Password prompt in Beeline is continuously printed
[ https://issues.apache.org/jira/browse/HIVE-14049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-14049: Description: I'm experiencing this issue with a Mac, which was not occurring until recently. {code} Beeline version 2.2.0-SNAPSHOT by Apache Hive beeline> !connect jdbc:hive2://localhost:1 Connecting to jdbc:hive2://localhost:1 Enter username for jdbc:hive2://localhost:1: hive Enter password for jdbc:hive2://localhost:1: Enter password for jdbc:hive2://localhost:1: Enter password for jdbc:hive2://localhost:1: ... {code} The 'Enter password for jdbc:hive2://localhost:1:' line continues to print until enter is hit. From looking at the code in Commands.java (lines 1413-1420), it's not quite clear why this happens on the second call to readLine()) : {code} if (username == null) { username = beeLine.getConsoleReader().readLine("Enter username for " + url + ": "); } props.setProperty("user", username); if (password == null) { password = beeLine.getConsoleReader().readLine("Enter password for " + url + ": ", new Character('*')); } {code} was: I'm experiencing this issue with a Mac, which was not occurring until recently. {code} Beeline version 2.2.0-SNAPSHOT by Apache Hive beeline> !connect jdbc:hive2://localhost:1 Connecting to jdbc:hive2://localhost:1 Enter username for jdbc:hive2://localhost:1: hive Enter password for jdbc:hive2://localhost:1: Enter password for jdbc:hive2://localhost:1: Enter password for jdbc:hive2://localhost:1: ... {code} The 'Enter password for jdbc:hive2://localhost:1:' line continues to print until enter is hit. 
From looking at the code in Commands.java (lines 1413-1420), it's not quite clear why this happens on the second call to readLine()) : {code} if (username == null) { username = beeLine.getConsoleReader().readLine("Enter username for " + url + ": "); } props.setProperty("user", username); if (password == null) { password = beeLine.getConsoleReader().readLine("Enter password for " + url + ": ", new Character('*')); } {code} > Password prompt in Beeline is continuously printed > -- > > Key: HIVE-14049 > URL: https://issues.apache.org/jira/browse/HIVE-14049 > Project: Hive > Issue Type: Bug > Components: Beeline >Affects Versions: 2.0.1 >Reporter: Abdullah Yousufi > > I'm experiencing this issue with a Mac, which was not occurring until > recently. > {code} > Beeline version 2.2.0-SNAPSHOT by Apache Hive > beeline> !connect jdbc:hive2://localhost:1 > Connecting to jdbc:hive2://localhost:1 > Enter username for jdbc:hive2://localhost:1: hive > Enter password for jdbc:hive2://localhost:1: > Enter password for jdbc:hive2://localhost:1: > Enter password for jdbc:hive2://localhost:1: > ... > {code} > The 'Enter password for jdbc:hive2://localhost:1:' line continues to > print until enter is hit. From looking at the code in Commands.java (lines > 1413-1420), it's not quite clear why this happens on the second call to > readLine()) : > {code} > if (username == null) { > username = beeLine.getConsoleReader().readLine("Enter username for " + url > + ": "); > } > props.setProperty("user", username); > if (password == null) { > password = beeLine.getConsoleReader().readLine("Enter password for " + url > + ": ", > new Character('*')); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13964) Add a parameter to beeline to allow a properties file to be passed in
[ https://issues.apache.org/jira/browse/HIVE-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336633#comment-15336633 ] Abdullah Yousufi commented on HIVE-13964: - Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12810921/HIVE-13964.05.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10234 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_globallimit org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorImmediateAllocation {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/130/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/130/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-130/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12810921 - PreCommit-HIVE-MASTER-Build > Add a parameter to beeline to allow a properties file to be passed in > - > > Key: HIVE-13964 > URL: https://issues.apache.org/jira/browse/HIVE-13964 > Project: Hive > Issue Type: New Feature > Components: Beeline >Affects Versions: 2.0.1 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-13964.01.patch, HIVE-13964.02.patch, > HIVE-13964.03.patch, HIVE-13964.04.patch, HIVE-13964.05.patch > > > HIVE-6652 removed the ability to pass in a properties file as a beeline > parameter. It may be a useful feature to be able to pass the file in is a > parameter, such as --property-file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (HIVE-13964) Add a parameter to beeline to allow a properties file to be passed in
[ https://issues.apache.org/jira/browse/HIVE-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-13964: Comment: was deleted (was: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12810921/HIVE-13964.05.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10234 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_globallimit org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorImmediateAllocation {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/130/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/130/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-130/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12810921 - PreCommit-HIVE-MASTER-Build) > Add a parameter to beeline to allow a properties file to be passed in > - > > Key: HIVE-13964 > URL: https://issues.apache.org/jira/browse/HIVE-13964 > Project: Hive > Issue Type: New Feature > Components: Beeline >Affects Versions: 2.0.1 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-13964.01.patch, HIVE-13964.02.patch, > HIVE-13964.03.patch, HIVE-13964.04.patch, HIVE-13964.05.patch > > > HIVE-6652 removed the ability to pass in a properties file as a beeline > parameter. It may be a useful feature to be able to pass the file in is a > parameter, such as --property-file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13964) Add a parameter to beeline to allow a properties file to be passed in
[ https://issues.apache.org/jira/browse/HIVE-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336623#comment-15336623 ] Abdullah Yousufi commented on HIVE-13964: - Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12810921/HIVE-13964.05.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10234 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_globallimit org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorImmediateAllocation {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/130/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/130/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-130/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12810921 - PreCommit-HIVE-MASTER-Build > Add a parameter to beeline to allow a properties file to be passed in > - > > Key: HIVE-13964 > URL: https://issues.apache.org/jira/browse/HIVE-13964 > Project: Hive > Issue Type: New Feature > Components: Beeline >Affects Versions: 2.0.1 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-13964.01.patch, HIVE-13964.02.patch, > HIVE-13964.03.patch, HIVE-13964.04.patch, HIVE-13964.05.patch > > > HIVE-6652 removed the ability to pass in a properties file as a beeline > parameter. It may be a useful feature to be able to pass the file in is a > parameter, such as --property-file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13964) Add a parameter to beeline to allow a properties file to be passed in
[ https://issues.apache.org/jira/browse/HIVE-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-13964: Attachment: HIVE-13964.05.patch > Add a parameter to beeline to allow a properties file to be passed in > - > > Key: HIVE-13964 > URL: https://issues.apache.org/jira/browse/HIVE-13964 > Project: Hive > Issue Type: New Feature > Components: Beeline >Affects Versions: 2.0.1 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-13964.01.patch, HIVE-13964.02.patch, > HIVE-13964.03.patch, HIVE-13964.04.patch, HIVE-13964.05.patch > > > HIVE-6652 removed the ability to pass in a properties file as a beeline > parameter. It may be a useful feature to be able to pass the file in is a > parameter, such as --property-file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13964) Add a parameter to beeline to allow a properties file to be passed in
[ https://issues.apache.org/jira/browse/HIVE-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-13964: Attachment: HIVE-13964.04.patch > Add a parameter to beeline to allow a properties file to be passed in > - > > Key: HIVE-13964 > URL: https://issues.apache.org/jira/browse/HIVE-13964 > Project: Hive > Issue Type: New Feature > Components: Beeline >Affects Versions: 2.0.1 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-13964.01.patch, HIVE-13964.02.patch, > HIVE-13964.03.patch, HIVE-13964.04.patch > > > HIVE-6652 removed the ability to pass in a properties file as a beeline > parameter. It may be a useful feature to be able to pass the file in is a > parameter, such as --property-file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13987) Clarify current error shown when HS2 is down
[ https://issues.apache.org/jira/browse/HIVE-13987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-13987: Attachment: HIVE-13987.02.patch > Clarify current error shown when HS2 is down > > > Key: HIVE-13987 > URL: https://issues.apache.org/jira/browse/HIVE-13987 > Project: Hive > Issue Type: Bug > Components: Beeline >Affects Versions: 2.0.1 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-13987.01.patch, HIVE-13987.02.patch > > > When HS2 is down and a query is run, the following error is shown in beeline: > {code} > 0: jdbc:hive2://localhost:1> show tables; > Error: org.apache.thrift.transport.TTransportException (state=08S01,code=0) > {code} > It may be more helpful to also indicate that the reason for this is that HS2 > is down, such as: > {code} > 0: jdbc:hive2://localhost:1> show tables; > HS2 may be unavailable, check server status > Error: org.apache.thrift.transport.TTransportException (state=08S01,code=0) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13964) Add a parameter to beeline to allow a properties file to be passed in
[ https://issues.apache.org/jira/browse/HIVE-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328369#comment-15328369 ] Abdullah Yousufi commented on HIVE-13964: - So let's hold off on committing this until I resolve the NullPointerException, which occurs when the username and password are not provided in the property file. > Add a parameter to beeline to allow a properties file to be passed in > - > > Key: HIVE-13964 > URL: https://issues.apache.org/jira/browse/HIVE-13964 > Project: Hive > Issue Type: New Feature > Components: Beeline >Affects Versions: 2.0.1 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-13964.01.patch, HIVE-13964.02.patch, > HIVE-13964.03.patch > > > HIVE-6652 removed the ability to pass in a properties file as a beeline > parameter. It may be a useful feature to be able to pass the file in is a > parameter, such as --property-file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13987) Clarify current error shown when HS2 is down
[ https://issues.apache.org/jira/browse/HIVE-13987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328218#comment-15328218 ] Abdullah Yousufi commented on HIVE-13987: - Unless I'm not understanding, the TTransportException error is also included with the fix. For example, once HS2 is killed: {code} 0: jdbc:hive2://localhost:1> show tables; HS2 may be unavailable, check server status Error: org.apache.thrift.transport.TTransportException (state=08S01,code=0) {code} > Clarify current error shown when HS2 is down > > > Key: HIVE-13987 > URL: https://issues.apache.org/jira/browse/HIVE-13987 > Project: Hive > Issue Type: Bug > Components: Beeline >Affects Versions: 2.0.1 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-13987.01.patch > > > When HS2 is down and a query is run, the following error is shown in beeline: > {code} > 0: jdbc:hive2://localhost:1> show tables; > Error: org.apache.thrift.transport.TTransportException (state=08S01,code=0) > {code} > It may be more helpful to also indicate that the reason for this is that HS2 > is down, such as: > {code} > 0: jdbc:hive2://localhost:1> show tables; > HS2 may be unavailable, check server status > Error: org.apache.thrift.transport.TTransportException (state=08S01,code=0) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
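The two-line output above amounts to prefixing the transport error with a server-status hint. A minimal sketch of that shaping follows; TransportDown stands in for org.apache.thrift.transport.TTransportException, and the formatting helper is hypothetical, not Beeline's actual error path.

```java
// Sketch: emit the HS2 hint only for transport-level failures, then the
// original error line, matching the output shown in this issue.
public class Hs2HintSketch {
    static class TransportDown extends RuntimeException {
        TransportDown(String message) { super(message); }
    }

    static String format(Throwable error) {
        StringBuilder out = new StringBuilder();
        if (error instanceof TransportDown) {
            out.append("HS2 may be unavailable, check server status\n");
        }
        out.append("Error: ").append(error.getClass().getSimpleName())
           .append(" (state=08S01,code=0)");
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(format(new TransportDown("connection refused")));
    }
}
```

Keeping the original TTransportException line intact, as the patch does, preserves the SQLSTATE for scripts that grep for it while still giving interactive users an actionable hint.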
[jira] [Commented] (HIVE-13964) Add a parameter to beeline to allow a properties file to be passed in
[ https://issues.apache.org/jira/browse/HIVE-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327937#comment-15327937 ] Abdullah Yousufi commented on HIVE-13964: - With regards to case #3, you need to pass in your login credentials as well. For example: {code} ConnectionURL=jdbc:hive2://localhost:1 ConnectionUserName=hive ConnectionPassword= {code} With case #1, the property-file requires a url because that is how the !properties command works: if you run beeline and execute {code} !properties {code} you’ll see the ‘Property “url” is required’ error. Therefore, I don’t know if it really makes sense to combine command line options, such as -u, with the property file, as you do in case #2. What happens there is that the shell initially connects to the url specified by -u, but when the properties command is run on props, it fails and the shell exits. > Add a parameter to beeline to allow a properties file to be passed in > - > > Key: HIVE-13964 > URL: https://issues.apache.org/jira/browse/HIVE-13964 > Project: Hive > Issue Type: New Feature > Components: Beeline >Affects Versions: 2.0.1 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-13964.01.patch, HIVE-13964.02.patch, > HIVE-13964.03.patch > > > HIVE-6652 removed the ability to pass in a properties file as a beeline > parameter. It may be a useful feature to be able to pass the file in as a > parameter, such as --property-file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (HIVE-13964) Add a parameter to beeline to allow a properties file to be passed in
[ https://issues.apache.org/jira/browse/HIVE-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-13964: Comment: was deleted (was: With regards to case #3, you need pass in your login credentials as well. For example: {code} ConnectionURL=jdbc:hive2://localhost:1 ConnectionUserName=hive ConnectionPassword= {code} With case #1, the property-file requires a url because that is how the !properties command works: if you run beeline and execute {code} !properties {code} you’ll see the ‘Property “url” is required error. Therefore, I don’t know if it really makes sense to combine command line options, such as -u, with the property file, as you do in case #2. What happens there is that the shell initially connects to the url specified by -u, but when the properties command is run on props, it fails and the shell exits.) > Add a parameter to beeline to allow a properties file to be passed in > - > > Key: HIVE-13964 > URL: https://issues.apache.org/jira/browse/HIVE-13964 > Project: Hive > Issue Type: New Feature > Components: Beeline >Affects Versions: 2.0.1 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-13964.01.patch, HIVE-13964.02.patch, > HIVE-13964.03.patch > > > HIVE-6652 removed the ability to pass in a properties file as a beeline > parameter. It may be a useful feature to be able to pass the file in is a > parameter, such as --property-file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13964) Add a parameter to beeline to allow a properties file to be passed in
[ https://issues.apache.org/jira/browse/HIVE-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327938#comment-15327938 ] Abdullah Yousufi commented on HIVE-13964: - With regards to case #3, you need to pass in your login credentials as well. For example: {code} ConnectionURL=jdbc:hive2://localhost:1 ConnectionUserName=hive ConnectionPassword= {code} With case #1, the property-file requires a url because that is how the !properties command works: if you run beeline and execute {code} !properties {code} you’ll see the ‘Property “url” is required’ error. Therefore, I don’t know if it really makes sense to combine command line options, such as -u, with the property file, as you do in case #2. What happens there is that the shell initially connects to the url specified by -u, but when the properties command is run on props, it fails and the shell exits. > Add a parameter to beeline to allow a properties file to be passed in > - > > Key: HIVE-13964 > URL: https://issues.apache.org/jira/browse/HIVE-13964 > Project: Hive > Issue Type: New Feature > Components: Beeline >Affects Versions: 2.0.1 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-13964.01.patch, HIVE-13964.02.patch, > HIVE-13964.03.patch > > > HIVE-6652 removed the ability to pass in a properties file as a beeline > parameter. It may be a useful feature to be able to pass the file in as a > parameter, such as --property-file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13987) Clarify current error shown when HS2 is down
[ https://issues.apache.org/jira/browse/HIVE-13987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-13987: Attachment: HIVE-13987.01.patch > Clarify current error shown when HS2 is down > > > Key: HIVE-13987 > URL: https://issues.apache.org/jira/browse/HIVE-13987 > Project: Hive > Issue Type: Bug > Components: Beeline >Affects Versions: 2.0.1 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-13987.01.patch > > > When HS2 is down and a query is run, the following error is shown in beeline: > {code} > 0: jdbc:hive2://localhost:1> show tables; > Error: org.apache.thrift.transport.TTransportException (state=08S01,code=0) > {code} > It may be more helpful to also indicate that the reason for this is that HS2 > is down, such as: > {code} > 0: jdbc:hive2://localhost:1> show tables; > HS2 may be unavailable, check server status > Error: org.apache.thrift.transport.TTransportException (state=08S01,code=0) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13987) Clarify current error shown when HS2 is down
[ https://issues.apache.org/jira/browse/HIVE-13987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-13987: Attachment: (was: HIVE-13987.01.patch) > Clarify current error shown when HS2 is down > > > Key: HIVE-13987 > URL: https://issues.apache.org/jira/browse/HIVE-13987 > Project: Hive > Issue Type: Bug > Components: Beeline >Affects Versions: 2.0.1 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-13987.01.patch > > > When HS2 is down and a query is run, the following error is shown in beeline: > {code} > 0: jdbc:hive2://localhost:1> show tables; > Error: org.apache.thrift.transport.TTransportException (state=08S01,code=0) > {code} > It may be more helpful to also indicate that the reason for this is that HS2 > is down, such as: > {code} > 0: jdbc:hive2://localhost:1> show tables; > HS2 may be unavailable, check server status > Error: org.apache.thrift.transport.TTransportException (state=08S01,code=0) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13964) Add a parameter to beeline to allow a properties file to be passed in
[ https://issues.apache.org/jira/browse/HIVE-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15325252#comment-15325252 ] Abdullah Yousufi commented on HIVE-13964: - I attached a new patch addressing the exit error issue. You also should not get the "No such file or directory" error. > Add a parameter to beeline to allow a properties file to be passed in > - > > Key: HIVE-13964 > URL: https://issues.apache.org/jira/browse/HIVE-13964 > Project: Hive > Issue Type: New Feature > Components: Beeline >Affects Versions: 2.0.1 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-13964.01.patch, HIVE-13964.02.patch, > HIVE-13964.03.patch > > > HIVE-6652 removed the ability to pass in a properties file as a beeline > parameter. It may be a useful feature to be able to pass the file in as a > parameter, such as --property-file.
[jira] [Updated] (HIVE-13964) Add a parameter to beeline to allow a properties file to be passed in
[ https://issues.apache.org/jira/browse/HIVE-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-13964: Attachment: HIVE-13964.03.patch
[jira] [Commented] (HIVE-13964) Add a parameter to beeline to allow a properties file to be passed in
[ https://issues.apache.org/jira/browse/HIVE-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15325076#comment-15325076 ] Abdullah Yousufi commented on HIVE-13964: - It shouldn't be displaying that error. Could you possibly retry and see if you get the error again?
[jira] [Comment Edited] (HIVE-13964) Add a parameter to beeline to allow a properties file to be passed in
[ https://issues.apache.org/jira/browse/HIVE-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15324967#comment-15324967 ] Abdullah Yousufi edited comment on HIVE-13964 at 6/10/16 6:22 PM: -- I resolved the first issue; the exit code is now set to 1 in that case. The second issue is important, though, because it is the one HIVE-6652 addressed, so it would be bad if it still existed. However, I checked, and that is already the current upstream behavior, so my patch doesn't reintroduce it. This is what I get in the repo (or with my patch): {code} $ ./beeline BLA Beeline version 2.2.0-SNAPSHOT by Apache Hive beeline> {code} Note that I don't get the 'No such file or directory' error. What behavior do we want here? It seems that the fix from HIVE-6652 was reverted at some point. [~xuefuz]
[jira] [Updated] (HIVE-13987) Clarify current error shown when HS2 is down
[ https://issues.apache.org/jira/browse/HIVE-13987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-13987: Attachment: HIVE-13987.01.patch
[jira] [Updated] (HIVE-13987) Clarify current error shown when HS2 is down
[ https://issues.apache.org/jira/browse/HIVE-13987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-13987: Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-13964) Add a parameter to beeline to allow a properties file to be passed in
[ https://issues.apache.org/jira/browse/HIVE-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-13964: Attachment: HIVE-13964.02.patch
[jira] [Updated] (HIVE-13964) Add a parameter to beeline to allow a properties file to be passed in
[ https://issues.apache.org/jira/browse/HIVE-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-13964: Attachment: (was: HIVE-13964.02.patch)
[jira] [Updated] (HIVE-13964) Add a parameter to beeline to allow a properties file to be passed in
[ https://issues.apache.org/jira/browse/HIVE-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-13964: Attachment: HIVE-13964.02.patch
[jira] [Comment Edited] (HIVE-13964) Add a parameter to beeline to allow a properties file to be passed in
[ https://issues.apache.org/jira/browse/HIVE-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323067#comment-15323067 ] Abdullah Yousufi edited comment on HIVE-13964 at 6/9/16 6:42 PM: - Thanks for the review, [~spena]. 1. Fixed: {code} if (propertyFile != null) { dispatch("!properties " + propertyFile); } {code} 2. That's strange, because I just tried that case and this was my output: {code} $ ./beeline --property-file Missing argument for option: property-file Usage: java org.apache.hive.cli.beeline.BeeLine -uthe JDBC URL to connect to ... {code} Could you try another parameter without any arguments, such as --hiveconf, and see if it prints the "Missing argument..." error for that? 3. Added a fix for this to exit, but I can undo it if necessary. Let me know about points 2 and 3, and then I can upload another patch.
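The one-line guard quoted in the comment above can be seen in context in the following sketch. This is an illustration of the approach only, not the actual BeeLine source: PropertyFileOption, parse(), and connectPhase() are hypothetical names, and dispatch() here is a stand-in for BeeLine's command dispatcher.

```java
import java.util.ArrayList;
import java.util.List;

// Hedged sketch of the discussed fix: the option parser records the value of
// --property-file, and later the client dispatches a "!properties" command
// only when the option was actually supplied. Names are illustrative.
public class PropertyFileOption {
    String propertyFile;                 // set by parse(); stays null if option absent
    final List<String> dispatched = new ArrayList<>();

    void dispatch(String command) {      // stand-in for BeeLine.dispatch()
        dispatched.add(command);
    }

    // Minimal hand-rolled option scan, just enough for the illustration.
    void parse(String[] args) {
        for (int i = 0; i < args.length; i++) {
            if ("--property-file".equals(args[i]) && i + 1 < args.length) {
                propertyFile = args[++i];
            }
        }
    }

    // The guard from the patch: dispatch only when the option was given,
    // so no spurious "!properties null" command is issued.
    void connectPhase() {
        if (propertyFile != null) {
            dispatch("!properties " + propertyFile);
        }
    }

    public static void main(String[] args) {
        PropertyFileOption o = new PropertyFileOption();
        o.parse(new String[] {"--property-file", "conn.properties"});
        o.connectPhase();
        System.out.println(o.dispatched);
    }
}
```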
[jira] [Updated] (HIVE-13964) Add a parameter to beeline to allow a properties file to be passed in
[ https://issues.apache.org/jira/browse/HIVE-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-13964: Status: Patch Available (was: Open)
[jira] [Comment Edited] (HIVE-13964) Add a parameter to beeline to allow a properties file to be passed in
[ https://issues.apache.org/jira/browse/HIVE-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15319746#comment-15319746 ] Abdullah Yousufi edited comment on HIVE-13964 at 6/8/16 12:14 AM: -- This patch shouldn't reintroduce that error message as it existed before, unless an invalid file is passed to the --property-file parameter. Are you suggesting the error message for such a case should be different? Thanks for the feedback! To clarify, this patch adds a new parameter that allows a property file to be passed in, and it adds a description of the parameter to the command-line help as well.
[jira] [Updated] (HIVE-13964) Add a parameter to beeline to allow a properties file to be passed in
[ https://issues.apache.org/jira/browse/HIVE-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-13964: Description: HIVE-6652 removed the ability to pass in a properties file as a beeline parameter. It may be a useful feature to be able to pass the file in as a parameter, such as --property-file. (was: HIVE-6652 removed the ability to pass in a properties file as a beeline parameter. It may be a useful feature to be able to pass the file in as a parameter.)
[jira] [Assigned] (HIVE-13967) CREATE table fails when 'values' column name is found on the table spec.
[ https://issues.apache.org/jira/browse/HIVE-13967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi reassigned HIVE-13967: --- Assignee: Abdullah Yousufi > CREATE table fails when 'values' column name is found on the table spec. > > > Key: HIVE-13967 > URL: https://issues.apache.org/jira/browse/HIVE-13967 > Project: Hive > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Sergio Peña >Assignee: Abdullah Yousufi > > {noformat} > hive> create table pkv (key int, values string); > > FailedPredicateException(identifier,{useSQL11ReservedKeywordsForIdentifier()}?) > at > org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:11914) > at > org.apache.hadoop.hive.ql.parse.HiveParser.identifier(HiveParser.java:51795) > at > org.apache.hadoop.hive.ql.parse.HiveParser.columnNameType(HiveParser.java:42051) > at > org.apache.hadoop.hive.ql.parse.HiveParser.columnNameTypeOrPKOrFK(HiveParser.java:42308) > at > org.apache.hadoop.hive.ql.parse.HiveParser.columnNameTypeOrPKOrFKList(HiveParser.java:37966) > at > org.apache.hadoop.hive.ql.parse.HiveParser.createTableStatement(HiveParser.java:5259) > at > org.apache.hadoop.hive.ql.parse.HiveParser.ddlStatement(HiveParser.java:2763) > at > org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1756) > at > org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1178) > at > org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:204) > at > org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:404) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:329) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1158) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1253) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > FAILED: ParseException line 1:27 Failed to recognize predicate 'values'. > Failed rule: 'identifier' in column specification > {noformat}
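The parse failure above comes from 'values' being treated as a reserved word under the SQL11 keyword rules (the FailedPredicateException names useSQL11ReservedKeywordsForIdentifier). Two possible workarounds, sketched here as hedged examples that should be verified against the target Hive version before relying on them:

```sql
-- Workaround 1: quote the reserved identifier with backticks
CREATE TABLE pkv (key INT, `values` STRING);

-- Workaround 2 (assumed available on this Hive line; removed in later
-- releases): relax the SQL11 reserved-keyword treatment of identifiers
SET hive.support.sql11.reserved.keywords=false;
CREATE TABLE pkv (key int, values string);
```

Backtick quoting keeps the default keyword rules intact and only affects the one column, so it is the less invasive of the two.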
[jira] [Updated] (HIVE-13964) Add a parameter to beeline to allow a properties file to be passed in
[ https://issues.apache.org/jira/browse/HIVE-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdullah Yousufi updated HIVE-13964: Attachment: HIVE-13964.01.patch