[jira] [Commented] (HIVE-14919) Improve the performance of Hive on Spark 2.0.0
[ https://issues.apache.org/jira/browse/HIVE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15561385#comment-15561385 ]

Ferdinand Xu commented on HIVE-14919:
-------------------------------------

cc [~kellyzly] [~dapengsun]

> Improve the performance of Hive on Spark 2.0.0
> ----------------------------------------------
>
> Key: HIVE-14919
> URL: https://issues.apache.org/jira/browse/HIVE-14919
> Project: Hive
> Issue Type: Improvement
> Reporter: Ferdinand Xu
> Assignee: Ferdinand Xu
> Attachments: benchmark.xlsx
>
> In HIVE-14029, we updated the Spark dependency to 2.0.0. We used Intel
> BigBench [1] to benchmark Spark 2.0 against Spark 1.6 over a 10 GB data set
> and saw significant performance degradation for most of the BigBench
> queries. Please see the attached file for details. This JIRA is the umbrella
> ticket for addressing those performance issues.
> [1] https://github.com/intel-hadoop/Big-Data-Benchmark-for-Big-Bench

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HIVE-14919) Improve the performance of Hive on Spark 2.0.0
[ https://issues.apache.org/jira/browse/HIVE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ferdinand Xu updated HIVE-14919:
--------------------------------
Description:
In HIVE-14029, we updated the Spark dependency to 2.0.0. We used Intel BigBench [1] to benchmark Spark 2.0 against Spark 1.6 over a 10 GB data set and saw significant performance degradation for most of the BigBench queries. Please see the attached file for details. This JIRA is the umbrella ticket for addressing those performance issues.
[1] https://github.com/intel-hadoop/Big-Data-Benchmark-for-Big-Bench

was:
In HIVE-14029, we have updated Spark dependency to 2.0.0. We use Intel BigBench [1] to run benchmark over 10 GB data set comparing with Spark 1.6. We can see quite some performance degradations for all the queries of BigBench. For detailed information, please see the attached files. This JIRA is the umbrella ticket addressing those performance issues.
[1] https://github.com/intel-hadoop/Big-Data-Benchmark-for-Big-Bench

> Improve the performance of Hive on Spark 2.0.0
> ----------------------------------------------
>
> Key: HIVE-14919
> URL: https://issues.apache.org/jira/browse/HIVE-14919
> Project: Hive
> Issue Type: Improvement
> Reporter: Ferdinand Xu
> Assignee: Ferdinand Xu
> Attachments: benchmark.xlsx
>
> In HIVE-14029, we updated the Spark dependency to 2.0.0. We used Intel
> BigBench [1] to benchmark Spark 2.0 against Spark 1.6 over a 10 GB data set
> and saw significant performance degradation for most of the BigBench
> queries. Please see the attached file for details. This JIRA is the umbrella
> ticket for addressing those performance issues.
> [1] https://github.com/intel-hadoop/Big-Data-Benchmark-for-Big-Bench
[jira] [Updated] (HIVE-14919) Improve the performance of Hive on Spark 2.0.0
[ https://issues.apache.org/jira/browse/HIVE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ferdinand Xu updated HIVE-14919:
--------------------------------
Attachment: benchmark.xlsx

> Improve the performance of Hive on Spark 2.0.0
> ----------------------------------------------
>
> Key: HIVE-14919
> URL: https://issues.apache.org/jira/browse/HIVE-14919
> Project: Hive
> Issue Type: Improvement
> Reporter: Ferdinand Xu
> Assignee: Ferdinand Xu
> Attachments: benchmark.xlsx
>
> In HIVE-14029, we have updated the Spark dependency to 2.0.0. We used Intel
> BigBench [1] to run a benchmark over a 10 GB data set comparing with Spark
> 1.6. We can see quite some performance degradation for all the queries of
> BigBench. For detailed information, please see the attached files. This JIRA
> is the umbrella ticket addressing those performance issues.
> [1] https://github.com/intel-hadoop/Big-Data-Benchmark-for-Big-Bench
[jira] [Commented] (HIVE-13873) Column pruning for nested fields
[ https://issues.apache.org/jira/browse/HIVE-13873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15561282#comment-15561282 ]

Hive QA commented on HIVE-13873:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832398/HIVE-13873.1.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 612 failed/errored test(s), 10668 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_globallimit]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_join]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_vectorization]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_vectorization_partition]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_vectorization_project]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_create_temp_table]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_1]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_3]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_4]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_disable_cbo_1]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_disable_cbo_3]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_disable_cbo_4]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_1]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_2]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_3]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_4]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_7]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_8]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_9]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join0]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join15]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join18]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join18_multi_distinct]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join20]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join27]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join30]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join31]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join_filters]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join_nulls]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join_reordering_values]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join_stats]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join_without_localtask]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_smb_mapjoin_14]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_10]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_6]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_9]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ba_table3]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ba_table_union]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_groupby]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin1]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin2]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin3]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin4]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin5]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_const]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_gby]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_gby_empty]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_input26]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_join]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_limit]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_auto_join1]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_gby]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_gby_empty]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_join1]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_join]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_limit]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_lineage2]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_semijoin]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_simple_select]
{noformat}
[jira] [Updated] (HIVE-14917) explainanalyze_2.q fails after HIVE-14861
[ https://issues.apache.org/jira/browse/HIVE-14917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pengcheng Xiong updated HIVE-14917:
-----------------------------------
Status: Open  (was: Patch Available)

> explainanalyze_2.q fails after HIVE-14861
> -----------------------------------------
>
> Key: HIVE-14917
> URL: https://issues.apache.org/jira/browse/HIVE-14917
> Project: Hive
> Issue Type: Sub-task
> Reporter: Pengcheng Xiong
> Assignee: Pengcheng Xiong
> Attachments: HIVE-14917.01.patch, HIVE-14917.02.patch, HIVE-14917.03.patch
[jira] [Updated] (HIVE-14917) explainanalyze_2.q fails after HIVE-14861
[ https://issues.apache.org/jira/browse/HIVE-14917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pengcheng Xiong updated HIVE-14917:
-----------------------------------
Status: Patch Available  (was: Open)

> explainanalyze_2.q fails after HIVE-14861
> -----------------------------------------
>
> Key: HIVE-14917
> URL: https://issues.apache.org/jira/browse/HIVE-14917
> Project: Hive
> Issue Type: Sub-task
> Reporter: Pengcheng Xiong
> Assignee: Pengcheng Xiong
> Attachments: HIVE-14917.01.patch, HIVE-14917.02.patch, HIVE-14917.03.patch
[jira] [Updated] (HIVE-14917) explainanalyze_2.q fails after HIVE-14861
[ https://issues.apache.org/jira/browse/HIVE-14917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pengcheng Xiong updated HIVE-14917:
-----------------------------------
Attachment: HIVE-14917.03.patch

> explainanalyze_2.q fails after HIVE-14861
> -----------------------------------------
>
> Key: HIVE-14917
> URL: https://issues.apache.org/jira/browse/HIVE-14917
> Project: Hive
> Issue Type: Sub-task
> Reporter: Pengcheng Xiong
> Assignee: Pengcheng Xiong
> Attachments: HIVE-14917.01.patch, HIVE-14917.02.patch, HIVE-14917.03.patch
[jira] [Commented] (HIVE-14918) Function concat_ws get a wrong value
[ https://issues.apache.org/jira/browse/HIVE-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15561219#comment-15561219 ]

Pengcheng Xiong commented on HIVE-14918:
----------------------------------------

[~wisgood] If you like, you are always welcome to create your own UDF that treats it differently.

> Function concat_ws get a wrong value
> ------------------------------------
>
> Key: HIVE-14918
> URL: https://issues.apache.org/jira/browse/HIVE-14918
> Project: Hive
> Issue Type: Bug
> Components: UDF
> Affects Versions: 1.1.1, 2.0.0, 2.1.0, 2.0.1
> Reporter: Xiaowei Wang
> Assignee: Xiaowei Wang
> Priority: Critical
> Fix For: 2.1.0
>
> Attachments: HIVE-14918.0.patch
>
> FROM src INSERT OVERWRITE TABLE dest1 SELECT 'abc', 'xyz', '8675309' WHERE src.key = 86;
> SELECT concat_ws('.', NULL) FROM dest1;
> The result is an empty string "", but I think it should return NULL.
[jira] [Commented] (HIVE-14815) Implement Parquet vectorization reader
[ https://issues.apache.org/jira/browse/HIVE-14815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15561193#comment-15561193 ]

Hive QA commented on HIVE-14815:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832387/HIVE-14815.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10671 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_parquet]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_parquet_types]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_parquet]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_parquet_types]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2]
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1450/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1450/console
Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1450/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12832387 - PreCommit-HIVE-Build

> Implement Parquet vectorization reader
> --------------------------------------
>
> Key: HIVE-14815
> URL: https://issues.apache.org/jira/browse/HIVE-14815
> Project: Hive
> Issue Type: Sub-task
> Reporter: Ferdinand Xu
> Assignee: Ferdinand Xu
> Attachments: HIVE-14815.patch
>
> Parquet does not provide a vectorized reader that can be used by Hive
> directly. In addition, a decimal column batch consists of a batch of
> HiveDecimal values, a Hive type unknown to Parquet. To support Hive's
> vectorized execution engine, we have to implement the vectorized Parquet
> reader on the Hive side. To limit the performance impact, we need to
> implement a page-level vectorized reader.
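The page-level batching described above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not Hive's or Parquet's actual API: `PageReader`, `BATCH_SIZE`, and `nextBatch` are hypothetical names used only to show the batch-at-a-time read loop.

```java
public class VectorizedReaderSketch {
    static final int BATCH_SIZE = 1024;

    // Hypothetical decoded-page source: copies up to max long values into dest
    // starting at offset and returns how many were copied (0 when exhausted).
    interface PageReader {
        int readLongs(long[] dest, int offset, int max);
    }

    // Fill one column vector batch-at-a-time instead of row-by-row; pages are
    // drained in sequence until the batch is full or the input runs out.
    static int nextBatch(PageReader page, long[] columnVector) {
        int filled = 0;
        while (filled < BATCH_SIZE) {
            int n = page.readLongs(columnVector, filled, BATCH_SIZE - filled);
            if (n == 0) {
                break; // no more values
            }
            filled += n;
        }
        return filled;
    }
}
```

The point of the page-level loop is that decoding amortizes per-value overhead across a whole column vector, which is what the issue means by limiting the performance impact.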
[jira] [Commented] (HIVE-14918) Function concat_ws get a wrong value
[ https://issues.apache.org/jira/browse/HIVE-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15561156#comment-15561156 ]

Xiaowei Wang commented on HIVE-14918:
-------------------------------------

Yes, it is not a bug in MySQL. I will close this. Thanks!

> Function concat_ws get a wrong value
> ------------------------------------
>
> Key: HIVE-14918
> URL: https://issues.apache.org/jira/browse/HIVE-14918
> Project: Hive
> Issue Type: Bug
> Components: UDF
> Affects Versions: 1.1.1, 2.0.0, 2.1.0, 2.0.1
> Reporter: Xiaowei Wang
> Assignee: Xiaowei Wang
> Priority: Critical
> Fix For: 2.1.0
>
> Attachments: HIVE-14918.0.patch
>
> FROM src INSERT OVERWRITE TABLE dest1 SELECT 'abc', 'xyz', '8675309' WHERE src.key = 86;
> SELECT concat_ws('.', NULL) FROM dest1;
> The result is an empty string "", but I think it should return NULL.
[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew
[ https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15561154#comment-15561154 ]

Xuefu Zhang commented on HIVE-14797:
------------------------------------

+1

> reducer number estimating may lead to data skew
> -----------------------------------------------
>
> Key: HIVE-14797
> URL: https://issues.apache.org/jira/browse/HIVE-14797
> Project: Hive
> Issue Type: Improvement
> Components: Query Processor
> Reporter: roncenzhao
> Assignee: roncenzhao
> Attachments: HIVE-14797.2.patch, HIVE-14797.3.patch, HIVE-14797.patch
>
> HiveKey's hash code is generated by multiplying by 31 key by key, as
> implemented in the method `ObjectInspectorUtils.getBucketHashCode()`:
> for (int i = 0; i < bucketFields.length; i++) {
>   int fieldHash = ObjectInspectorUtils.hashCode(bucketFields[i], bucketFieldInspectors[i]);
>   hashCode = 31 * hashCode + fieldHash;
> }
> The following example will lead to data skew:
> I have two tables called tbl1 and tbl2, and they have the same columns: a int,
> b string. The values of column 'a' in both tables are not skewed, but the
> values of column 'b' in both tables are skewed.
> When my SQL is "select * from tbl1 join tbl2 on tbl1.a=tbl2.a and
> tbl1.b=tbl2.b" and the estimated reducer number is 31, it will lead to data
> skew.
> As we know, the HiveKey's hash code is generated by `hash(a)*31 + hash(b)`.
> When the reducer number is 31, the reducer number of each row is `hash(b)%31`.
> As a result, the job will be skewed.
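The skew described above is simple modular arithmetic: (31*hash(a) + hash(b)) % 31 == hash(b) % 31, so with exactly 31 reducers the value of column 'a' has no influence on reducer assignment. A minimal sketch (a simplified stand-in for getBucketHashCode, not Hive's actual class):

```java
public class SkewDemo {
    // Simplified stand-in for ObjectInspectorUtils.getBucketHashCode():
    // field hashes are combined with multiplier 31, key by key.
    static int bucketHashCode(int hashA, int hashB) {
        int hashCode = 0;
        hashCode = 31 * hashCode + hashA;
        hashCode = 31 * hashCode + hashB;
        return hashCode;
    }

    public static void main(String[] args) {
        int reducers = 31;
        // 31 * hashA is always 0 modulo 31, so the reducer depends only on hashB.
        for (int hashA = 0; hashA < 1000; hashA++) {
            int reducer = Math.floorMod(bucketHashCode(hashA, 7), reducers);
            if (reducer != Math.floorMod(7, reducers)) {
                throw new AssertionError("unexpected reducer " + reducer);
            }
        }
        System.out.println("every row with hash(b) = 7 lands on the same reducer");
    }
}
```

Any reducer count that is a multiple of the hash multiplier 31 shows the same collapse; with other counts, hash(a) influences the bucket again.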
[jira] [Commented] (HIVE-14918) Function concat_ws get a wrong value
[ https://issues.apache.org/jira/browse/HIVE-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15561148#comment-15561148 ]

Pengcheng Xiong commented on HIVE-14918:
----------------------------------------

[~wisgood] I just took a look at the link. It was not marked as a bug in MySQL:
{code}
[19 Nov 2004 14:21] Sergei Golubchik
Thank you for taking the time to write to us, but this is not a bug. Please
double-check the documentation available at
http://www.mysql.com/documentation/ and the instructions on how to report a
bug at http://bugs.mysql.com/how-to-report.php

Additional info:
According to the manual, CONCAT_WS skips any `NULL' values after the
separator argument. Thus CONCAT_WS(' ', NULL, NULL) has zero strings to
concat, and the result, quite naturally, is empty string. It does not depend
on the separator:

mysql> select concat('>', concat_ws('=', NULL, NULL), '<');
+----------------------------------------------+
| concat('>', concat_ws('=', NULL, NULL), '<') |
+----------------------------------------------+
| ><                                           |
+----------------------------------------------+
{code}

> Function concat_ws get a wrong value
> ------------------------------------
>
> Key: HIVE-14918
> URL: https://issues.apache.org/jira/browse/HIVE-14918
> Project: Hive
> Issue Type: Bug
> Components: UDF
> Affects Versions: 1.1.1, 2.0.0, 2.1.0, 2.0.1
> Reporter: Xiaowei Wang
> Assignee: Xiaowei Wang
> Priority: Critical
> Fix For: 2.1.0
>
> Attachments: HIVE-14918.0.patch
>
> FROM src INSERT OVERWRITE TABLE dest1 SELECT 'abc', 'xyz', '8675309' WHERE src.key = 86;
> SELECT concat_ws('.', NULL) FROM dest1;
> The result is an empty string "", but I think it should return NULL.
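The semantics quoted above (NULL values after the separator are skipped, so all-NULL input has zero strings to concatenate) can be mimicked in a few lines. This is a hedged sketch of the documented behavior, not Hive's or MySQL's implementation:

```java
import java.util.StringJoiner;

public class ConcatWsDemo {
    // Sketch of the documented concat_ws semantics: NULL arguments after the
    // separator are skipped, so all-NULL input yields "" rather than NULL.
    static String concatWs(String sep, String... parts) {
        StringJoiner joiner = new StringJoiner(sep);
        for (String part : parts) {
            if (part != null) {
                joiner.add(part); // NULLs are skipped, not rendered
            }
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        System.out.println(">" + concatWs(".", (String) null) + "<"); // prints ><
        System.out.println(concatWs(".", "abc", null, "xyz"));        // prints abc.xyz
    }
}
```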
[jira] [Updated] (HIVE-13873) Column pruning for nested fields
[ https://issues.apache.org/jira/browse/HIVE-13873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ferdinand Xu updated HIVE-13873:
--------------------------------
Attachment: HIVE-13873.1.patch

Fix NPE

> Column pruning for nested fields
> --------------------------------
>
> Key: HIVE-13873
> URL: https://issues.apache.org/jira/browse/HIVE-13873
> Project: Hive
> Issue Type: New Feature
> Components: Logical Optimizer
> Reporter: Xuefu Zhang
> Assignee: Ferdinand Xu
> Attachments: HIVE-13873.1.patch, HIVE-13873.patch, HIVE-13873.wip.patch
>
> Some columnar file formats such as Parquet also store the fields of struct
> types column by column, using the encoding described in the Google Dremel
> paper. It is very common in big data for data to be stored in structs while
> queries only need a subset of the fields in those structs. However, Hive
> presently has to read the whole struct regardless of whether all fields are
> selected. Therefore, pruning unwanted sub-fields of structs (nested fields)
> at file-reading time would be a big performance boost for such scenarios.
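The pruning idea can be illustrated with a toy schema tree: given struct&lt;a:int, b:struct&lt;c:int, d:string&gt;&gt; and a query selecting only b.c, everything outside the b.c path can be dropped before reading. The Map-based schema and the prune helper below are hypothetical, purely for illustration, and are not Hive's schema API:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NestedPruneDemo {
    // Toy schema node: a struct is a Map from field name to child (either a
    // nested Map or a primitive type name as a String).
    @SuppressWarnings("unchecked")
    static Map<String, Object> prune(Map<String, Object> schema, List<String> path) {
        if (path.isEmpty()) {
            return schema; // nothing more to select below this point
        }
        String head = path.get(0);
        Object child = schema.get(head);
        Map<String, Object> pruned = new LinkedHashMap<>();
        if (child instanceof Map) {
            // Recurse: keep only the selected sub-path of the nested struct.
            pruned.put(head, prune((Map<String, Object>) child, path.subList(1, path.size())));
        } else {
            pruned.put(head, child); // reached a leaf field
        }
        return pruned;
    }

    public static void main(String[] args) {
        Map<String, Object> b = new LinkedHashMap<>();
        b.put("c", "int");
        b.put("d", "string");
        Map<String, Object> schema = new LinkedHashMap<>();
        schema.put("a", "int");
        schema.put("b", b);
        // SELECT b.c: only the b.c leaf survives pruning.
        System.out.println(prune(schema, Arrays.asList("b", "c"))); // prints {b={c=int}}
    }
}
```

A reader that is handed the pruned schema never materializes columns a and b.d, which is exactly the saving the issue is after.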
[jira] [Updated] (HIVE-14916) Reduce the memory requirements for Spark tests
[ https://issues.apache.org/jira/browse/HIVE-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dapeng Sun updated HIVE-14916:
------------------------------
Attachment: HIVE-14916.002.patch

> Reduce the memory requirements for Spark tests
> ----------------------------------------------
>
> Key: HIVE-14916
> URL: https://issues.apache.org/jira/browse/HIVE-14916
> Project: Hive
> Issue Type: Sub-task
> Reporter: Ferdinand Xu
> Assignee: Dapeng Sun
> Attachments: HIVE-14916.001.patch, HIVE-14916.002.patch
>
> As in HIVE-14887, we need to reduce the memory requirements for the Spark tests.
[jira] [Commented] (HIVE-14799) Query operation are not thread safe during its cancellation
[ https://issues.apache.org/jira/browse/HIVE-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15561106#comment-15561106 ]

Hive QA commented on HIVE-14799:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832386/HIVE-14799.4.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10663 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_3]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5]
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1449/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1449/console
Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1449/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12832386 - PreCommit-HIVE-Build

> Query operation are not thread safe during its cancellation
> -----------------------------------------------------------
>
> Key: HIVE-14799
> URL: https://issues.apache.org/jira/browse/HIVE-14799
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Reporter: Chaoyu Tang
> Assignee: Chaoyu Tang
> Attachments: HIVE-14799.1.patch, HIVE-14799.2.patch, HIVE-14799.3.patch, HIVE-14799.4.patch, HIVE-14799.patch
>
> When a query is cancelled, either via Beeline (Ctrl-C) or via the API call
> TCLIService.Client.CancelOperation, SQLOperation.cancel is invoked in a
> different thread from the one running the query in order to close/destroy
> its encapsulated Driver object. Neither SQLOperation nor Driver is
> thread-safe, which can sometimes result in runtime exceptions such as NPE.
> The errors from the running query are also not handled properly, which may
> leave some artifacts (files, locks, etc.) uncleaned after the query
> terminates.
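One common way to address the kind of race described in this issue is to serialize all access to the shared driver object behind a single lock and publish null once it has been destroyed. The sketch below is illustrative only: CancellableOperation, its inner Driver, and the locking discipline are hypothetical stand-ins, not the actual Hive classes or the patch under review:

```java
public class CancellableOperation {
    // Hypothetical stand-in for the query driver; not Hive's Driver class.
    static class Driver {
        void run() { /* would execute the query */ }
        void destroy() { /* would release files, locks, etc. */ }
    }

    private final Object driverLock = new Object();
    private Driver driver = new Driver();

    // Query thread: never touches the driver without holding the lock, and
    // re-checks for null in case cancel() already destroyed it.
    void runQuery() {
        synchronized (driverLock) {
            if (driver != null) {
                driver.run();
            }
        }
    }

    // Cancel thread: destroy exactly once, then publish null so later accesses
    // see a consistent "gone" state instead of a half-destroyed object.
    void cancel() {
        synchronized (driverLock) {
            if (driver != null) {
                driver.destroy();
                driver = null;
            }
        }
    }

    boolean isCancelled() {
        synchronized (driverLock) {
            return driver == null;
        }
    }
}
```

A real fix would use a finer-grained lock (holding one lock for the whole query would block cancellation), but the null-check-under-lock shape is the essence of avoiding the NPE described above.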
[jira] [Commented] (HIVE-14918) Function concat_ws get a wrong value
[ https://issues.apache.org/jira/browse/HIVE-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15561056#comment-15561056 ]

Xiaowei Wang commented on HIVE-14918:
-------------------------------------

It is true that concat_ws('.', NULL) in MySQL returns an empty string:
https://bugs.mysql.com/bug.php?id=6719
But my colleagues and I find this confusing. Setting MySQL aside, which behavior do you think is more reasonable? Thanks for your explanation.

> Function concat_ws get a wrong value
> ------------------------------------
>
> Key: HIVE-14918
> URL: https://issues.apache.org/jira/browse/HIVE-14918
> Project: Hive
> Issue Type: Bug
> Components: UDF
> Affects Versions: 1.1.1, 2.0.0, 2.1.0, 2.0.1
> Reporter: Xiaowei Wang
> Assignee: Xiaowei Wang
> Priority: Critical
> Fix For: 2.1.0
>
> Attachments: HIVE-14918.0.patch
>
> FROM src INSERT OVERWRITE TABLE dest1 SELECT 'abc', 'xyz', '8675309' WHERE src.key = 86;
> SELECT concat_ws('.', NULL) FROM dest1;
> The result is an empty string "", but I think it should return NULL.
[jira] [Commented] (HIVE-14799) Query operation are not thread safe during its cancellation
[ https://issues.apache.org/jira/browse/HIVE-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15561022#comment-15561022 ]

Hive QA commented on HIVE-14799:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832386/HIVE-14799.4.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10663 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_3]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5]
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1448/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1448/console
Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1448/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12832386 - PreCommit-HIVE-Build

> Query operation are not thread safe during its cancellation
> -----------------------------------------------------------
>
> Key: HIVE-14799
> URL: https://issues.apache.org/jira/browse/HIVE-14799
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Reporter: Chaoyu Tang
> Assignee: Chaoyu Tang
> Attachments: HIVE-14799.1.patch, HIVE-14799.2.patch, HIVE-14799.3.patch, HIVE-14799.4.patch, HIVE-14799.patch
>
> When a query is cancelled, either via Beeline (Ctrl-C) or via the API call
> TCLIService.Client.CancelOperation, SQLOperation.cancel is invoked in a
> different thread from the one running the query in order to close/destroy
> its encapsulated Driver object. Neither SQLOperation nor Driver is
> thread-safe, which can sometimes result in runtime exceptions such as NPE.
> The errors from the running query are also not handled properly, which may
> leave some artifacts (files, locks, etc.) uncleaned after the query
> terminates.
[jira] [Commented] (HIVE-13589) beeline - support prompt for password with '-u' option
[ https://issues.apache.org/jira/browse/HIVE-13589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15560992#comment-15560992 ]

Ferdinand Xu commented on HIVE-13589:
-------------------------------------

I left some comments on the review board. Also, please attach your patch to trigger the precommit run. Thanks!

> beeline - support prompt for password with '-u' option
> ------------------------------------------------------
>
> Key: HIVE-13589
> URL: https://issues.apache.org/jira/browse/HIVE-13589
> Project: Hive
> Issue Type: Bug
> Components: Beeline
> Reporter: Thejas M Nair
> Assignee: Vihang Karajgaonkar
> Fix For: 2.2.0
>
> Attachments: HIVE-13589.1.patch, HIVE-13589.2.patch, HIVE-13589.3.patch, HIVE-13589.4.patch, HIVE-13589.5.patch, HIVE-13589.6.patch, HIVE-13589.7.patch, HIVE-13589.8.patch, HIVE-13589.9.patch
>
> Specifying the connection string using command-line options in Beeline is
> convenient, as it gets saved in the shell command history and is easy to
> retrieve from there. However, specifying the password on the command line is
> not secure, as it gets displayed on screen and saved in the history. It
> should be possible to specify '-p' without an argument to make Beeline
> prompt for the password.
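The requested behavior (prompt instead of taking the password from argv) can be sketched with java.io.Console, which reads input without echoing it to the terminal. The class below is a hypothetical illustration, not Beeline's actual code:

```java
import java.io.Console;
import java.util.Scanner;

public class PasswordPromptDemo {
    // Sketch of the behavior requested above: with '-p' given no argument,
    // read the password interactively so it never appears in argv or in the
    // shell history.
    static char[] promptForPassword() {
        Console console = System.console();
        if (console != null) {
            // Preferred path: input is not echoed to the terminal.
            return console.readPassword("Enter password: ");
        }
        // Fallback when no interactive console is attached (e.g. piped input).
        return new Scanner(System.in).nextLine().toCharArray();
    }
}
```

Returning char[] rather than String is the usual convention here, since a char[] can be zeroed out after use instead of lingering in the string pool.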
[jira] [Updated] (HIVE-14815) Implement Parquet vectorization reader
[ https://issues.apache.org/jira/browse/HIVE-14815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ferdinand Xu updated HIVE-14815:
--------------------------------
Status: Patch Available  (was: Open)

> Implement Parquet vectorization reader
> --------------------------------------
>
> Key: HIVE-14815
> URL: https://issues.apache.org/jira/browse/HIVE-14815
> Project: Hive
> Issue Type: Sub-task
> Reporter: Ferdinand Xu
> Assignee: Ferdinand Xu
> Attachments: HIVE-14815.patch
>
> Parquet does not provide a vectorized reader that can be used by Hive
> directly. In addition, a decimal column batch consists of a batch of
> HiveDecimal values, a Hive type unknown to Parquet. To support Hive's
> vectorized execution engine, we have to implement the vectorized Parquet
> reader on the Hive side. To limit the performance impact, we need to
> implement a page-level vectorized reader.
[jira] [Updated] (HIVE-14815) Implement Parquet vectorization reader
[ https://issues.apache.org/jira/browse/HIVE-14815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ferdinand Xu updated HIVE-14815:
--------------------------------
Attachment: HIVE-14815.patch

> Implement Parquet vectorization reader
> --------------------------------------
>
> Key: HIVE-14815
> URL: https://issues.apache.org/jira/browse/HIVE-14815
> Project: Hive
> Issue Type: Sub-task
> Reporter: Ferdinand Xu
> Assignee: Ferdinand Xu
> Attachments: HIVE-14815.patch
>
> Parquet does not provide a vectorized reader that can be used by Hive
> directly. In addition, a decimal column batch consists of a batch of
> HiveDecimal values, a Hive type unknown to Parquet. To support Hive's
> vectorized execution engine, we have to implement the vectorized Parquet
> reader on the Hive side. To limit the performance impact, we need to
> implement a page-level vectorized reader.
[jira] [Commented] (HIVE-14913) Add new unit tests
[ https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15560939#comment-15560939 ]

Hive QA commented on HIVE-14913:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832382/HIVE-14913.3.patch

{color:green}SUCCESS:{color} +1 due to 11 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10663 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_bulk]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[alter_merge_orc]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_0]
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1447/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1447/console
Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1447/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12832382 - PreCommit-HIVE-Build

> Add new unit tests
> ------------------
>
> Key: HIVE-14913
> URL: https://issues.apache.org/jira/browse/HIVE-14913
> Project: Hive
> Issue Type: Task
> Components: Tests
> Reporter: Vineet Garg
> Assignee: Vineet Garg
> Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch, HIVE-14913.3.patch
>
> Moving a bunch of tests from system tests to Hive unit tests to reduce testing overhead
[jira] [Updated] (HIVE-14799) Query operation are not thread safe during its cancellation
[ https://issues.apache.org/jira/browse/HIVE-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-14799: --- Attachment: HIVE-14799.4.patch > Query operations are not thread safe during cancellation > --- > > Key: HIVE-14799 > URL: https://issues.apache.org/jira/browse/HIVE-14799 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-14799.1.patch, HIVE-14799.2.patch, > HIVE-14799.3.patch, HIVE-14799.4.patch, HIVE-14799.patch > > > When a query is cancelled either via Beeline (Ctrl-C) or the API call > TCLIService.Client.CancelOperation, SQLOperation.cancel is invoked in a > different thread from the one running the query to close/destroy its > encapsulated Driver object. Neither SQLOperation nor Driver is thread-safe, > which can sometimes result in runtime exceptions such as NPE. Errors from > the running query are also not handled properly, which may leave some > resources (files, locks, etc.) uncleaned after the query terminates. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14799) Query operations are not thread safe during cancellation
[ https://issues.apache.org/jira/browse/HIVE-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-14799: --- Attachment: (was: HIVE-14799.4.patch) > Query operations are not thread safe during cancellation > --- > > Key: HIVE-14799 > URL: https://issues.apache.org/jira/browse/HIVE-14799 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-14799.1.patch, HIVE-14799.2.patch, > HIVE-14799.3.patch, HIVE-14799.4.patch, HIVE-14799.patch > > > When a query is cancelled either via Beeline (Ctrl-C) or the API call > TCLIService.Client.CancelOperation, SQLOperation.cancel is invoked in a > different thread from the one running the query to close/destroy its > encapsulated Driver object. Neither SQLOperation nor Driver is thread-safe, > which can sometimes result in runtime exceptions such as NPE. Errors from > the running query are also not handled properly, which may leave some > resources (files, locks, etc.) uncleaned after the query terminates. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14799) Query operations are not thread safe during cancellation
[ https://issues.apache.org/jira/browse/HIVE-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-14799: --- Attachment: HIVE-14799.4.patch The tests failed in my local environment even without the patch, so I wonder whether the failures are related to the patch. Reattaching for another run. > Query operations are not thread safe during cancellation > --- > > Key: HIVE-14799 > URL: https://issues.apache.org/jira/browse/HIVE-14799 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-14799.1.patch, HIVE-14799.2.patch, > HIVE-14799.3.patch, HIVE-14799.4.patch, HIVE-14799.patch > > > When a query is cancelled either via Beeline (Ctrl-C) or the API call > TCLIService.Client.CancelOperation, SQLOperation.cancel is invoked in a > different thread from the one running the query to close/destroy its > encapsulated Driver object. Neither SQLOperation nor Driver is thread-safe, > which can sometimes result in runtime exceptions such as NPE. Errors from > the running query are also not handled properly, which may leave some > resources (files, locks, etc.) uncleaned after the query terminates. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14917) explainanalyze_2.q fails after HIVE-14861
[ https://issues.apache.org/jira/browse/HIVE-14917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15560859#comment-15560859 ] Hive QA commented on HIVE-14917: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832380/HIVE-14917.02.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10663 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1446/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1446/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1446/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12832380 - PreCommit-HIVE-Build > explainanalyze_2.q fails after HIVE-14861 > - > > Key: HIVE-14917 > URL: https://issues.apache.org/jira/browse/HIVE-14917 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14917.01.patch, HIVE-14917.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14803) S3: Stats gathering for insert queries can be expensive for partitioned dataset
[ https://issues.apache.org/jira/browse/HIVE-14803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15560855#comment-15560855 ] Pengcheng Xiong commented on HIVE-14803: LGTM +1, thanks for the patch! > S3: Stats gathering for insert queries can be expensive for partitioned > dataset > --- > > Key: HIVE-14803 > URL: https://issues.apache.org/jira/browse/HIVE-14803 > Project: Hive > Issue Type: Improvement > Components: Metastore >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-14803.1.patch > > > StatsTask's aggregateStats populates stats details for all partitions by > checking the file sizes, which turns out to be expensive when a large number > of partitions is inserted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14358) Add metrics for number of queries executed for each execution engine (mr, spark, tez)
[ https://issues.apache.org/jira/browse/HIVE-14358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15560851#comment-15560851 ] Lefty Leverenz commented on HIVE-14358: --- [~zsombor.klara], will you have time to work on the metrics documentation? Or should I create a new JIRA issue for documenting metrics? cc: [~szehon] > Add metrics for number of queries executed for each execution engine (mr, > spark, tez) > - > > Key: HIVE-14358 > URL: https://issues.apache.org/jira/browse/HIVE-14358 > Project: Hive > Issue Type: Task > Components: HiveServer2 >Affects Versions: 2.1.0 >Reporter: Lenni Kuff >Assignee: Barna Zsombor Klara > Fix For: 2.2.0 > > Attachments: HIVE-14358.patch > > > HiveServer2 currently has a metric for the total number of queries ran since > last restart, but it would be useful to also have metrics for number of > queries ran for each execution engine. This would improve supportability by > allowing users to get a high-level understanding of what workloads had been > running on the server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14918) Function concat_ws returns a wrong value
[ https://issues.apache.org/jira/browse/HIVE-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15560808#comment-15560808 ] Pengcheng Xiong commented on HIVE-14918: When it was initially developed, it followed MySQL: {code} mysql> SELECT concat_ws('.',NULL) FROM (SELECT 'abc', 'xyz', '8675309' from t WHERE t.test = 86)subq;
+---------------------+
| concat_ws('.',NULL) |
+---------------------+
|                     |
+---------------------+
1 row in set (0.13 sec) {code} > Function concat_ws returns a wrong value > -- > > Key: HIVE-14918 > URL: https://issues.apache.org/jira/browse/HIVE-14918 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 1.1.1, 2.0.0, 2.1.0, 2.0.1 >Reporter: Xiaowei Wang >Assignee: Xiaowei Wang >Priority: Critical > Fix For: 2.1.0 > > Attachments: HIVE-14918.0.patch > > > FROM src INSERT OVERWRITE TABLE dest1 SELECT 'abc', 'xyz', '8675309' WHERE > src.key = 86; > SELECT concat_ws('.',NULL) FROM dest1 ; > The result is an empty string "", but I think it should return NULL. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14803) S3: Stats gathering for insert queries can be expensive for partitioned dataset
[ https://issues.apache.org/jira/browse/HIVE-14803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15560780#comment-15560780 ] Rajesh Balamohan commented on HIVE-14803: - Thanks [~pxiong]. RB link: https://reviews.apache.org/r/52670/ > S3: Stats gathering for insert queries can be expensive for partitioned > dataset > --- > > Key: HIVE-14803 > URL: https://issues.apache.org/jira/browse/HIVE-14803 > Project: Hive > Issue Type: Improvement > Components: Metastore >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-14803.1.patch > > > StatsTask's aggregateStats populates stats details for all partitions by > checking the file sizes, which turns out to be expensive when a large number > of partitions is inserted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14799) Query operations are not thread safe during cancellation
[ https://issues.apache.org/jira/browse/HIVE-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15560773#comment-15560773 ] Hive QA commented on HIVE-14799: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832379/HIVE-14799.3.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10663 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_3] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1445/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1445/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1445/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12832379 - PreCommit-HIVE-Build > Query operations are not thread safe during cancellation > --- > > Key: HIVE-14799 > URL: https://issues.apache.org/jira/browse/HIVE-14799 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-14799.1.patch, HIVE-14799.2.patch, > HIVE-14799.3.patch, HIVE-14799.patch > > > When a query is cancelled either via Beeline (Ctrl-C) or the API call > TCLIService.Client.CancelOperation, SQLOperation.cancel is invoked in a > different thread from the one running the query to close/destroy its > encapsulated Driver object. Neither SQLOperation nor Driver is thread-safe, > which can sometimes result in runtime exceptions such as NPE. Errors from > the running query are also not handled properly, which may leave some > resources (files, locks, etc.) uncleaned after the query terminates. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14918) Function concat_ws returns a wrong value
[ https://issues.apache.org/jira/browse/HIVE-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15560739#comment-15560739 ] Xiaowei Wang commented on HIVE-14918: - I mean, concat_ws('.',NULL) should return NULL, not an empty string "". What do you think? > Function concat_ws returns a wrong value > -- > > Key: HIVE-14918 > URL: https://issues.apache.org/jira/browse/HIVE-14918 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 1.1.1, 2.0.0, 2.1.0, 2.0.1 >Reporter: Xiaowei Wang >Assignee: Xiaowei Wang >Priority: Critical > Fix For: 2.1.0 > > Attachments: HIVE-14918.0.patch > > > FROM src INSERT OVERWRITE TABLE dest1 SELECT 'abc', 'xyz', '8675309' WHERE > src.key = 86; > SELECT concat_ws('.',NULL) FROM dest1 ; > The result is an empty string "", but I think it should return NULL. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
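For reference, the skip-NULL semantics under discussion can be sketched in Python (a hypothetical model for illustration only; Hive's actual implementation is the Java GenericUDFConcatWS, and `None` stands in for SQL NULL here):

```python
# Hypothetical model of the CURRENT concat_ws behavior debated in HIVE-14918:
# NULL (None) arguments are skipped rather than propagated, so when every
# argument is None the result is an empty string instead of NULL.
def concat_ws(sep, *args):
    parts = [str(a) for a in args if a is not None]  # drop NULL arguments
    return sep.join(parts)

print(concat_ws('.', 'abc', None, 'xyz'))  # abc.xyz
print(repr(concat_ws('.', None)))          # '' -- empty string, not None
```

The bug report argues that the second call should instead return NULL (`None` in this sketch) when all arguments are NULL.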
[jira] [Updated] (HIVE-14913) Add new unit tests
[ https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-14913: --- Status: Patch Available (was: Open) Took care of the test failures. > Add new unit tests > -- > > Key: HIVE-14913 > URL: https://issues.apache.org/jira/browse/HIVE-14913 > Project: Hive > Issue Type: Task > Components: Tests >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch, > HIVE-14913.3.patch > > > Moving a bunch of tests from system tests to Hive unit tests to reduce > testing overhead -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14913) Add new unit tests
[ https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-14913: --- Attachment: HIVE-14913.3.patch > Add new unit tests > -- > > Key: HIVE-14913 > URL: https://issues.apache.org/jira/browse/HIVE-14913 > Project: Hive > Issue Type: Task > Components: Tests >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch, > HIVE-14913.3.patch > > > Moving a bunch of tests from system tests to Hive unit tests to reduce > testing overhead -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15560692#comment-15560692 ] Hive QA commented on HIVE-11394: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832378/HIVE-11394.07.patch {color:green}SUCCESS:{color} +1 due to 126 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10633 tests executed *Failed tests:* {noformat} TestMiniLlapLocalCliDriver-orc_llap.q-union5.q-delete_where_non_partitioned.q-and-27-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_udf] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_udf1] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_udf] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] org.apache.hadoop.hive.ql.exec.vector.TestVectorSelectOperator.testSelectOperator org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1444/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1444/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1444/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12832378 - PreCommit-HIVE-Build > Enhance EXPLAIN display for vectorization > - > > Key: HIVE-11394 > URL: https://issues.apache.org/jira/browse/HIVE-11394 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, > HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, > HIVE-11394.06.patch, HIVE-11394.07.patch > > > Add detail to the EXPLAIN output showing why a Map or Reduce work is not > vectorized. > New syntax is: EXPLAIN VECTORIZATION \[ONLY\] \[SUMMARY|DETAIL\] > The ONLY option suppresses most non-vectorization elements. > SUMMARY shows vectorization information for the PLAN (is vectorization > enabled) and a summary of Map and Reduce work. > By default, ONLY is off and the level is SUMMARY. > Here are some examples: > EXPLAIN VECTORIZATION example: > (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization > sections) > It is the same as EXPLAIN VECTORIZATION SUMMARY. 
> {code} > PLAN VECTORIZATION: > enabled: true > enabledConditionsMet: [hive.vectorized.execution.enabled IS true] > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Tez > … > Edges: > Reducer 2 <- Map 1 (SIMPLE_EDGE) > … > Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: decimal_date_test > Statistics: Num rows: 12288 Data size: 2467616 Basic stats: > COMPLETE Column stats: NONE > Filter Operator > predicate: cdate BETWEEN 1969-12-30 AND 1970-01-02 (type: > boolean) > Statistics: Num rows: 6144 Data size: 1233808 Basic > stats: COMPLETE Column stats: NONE > Select Operator > expressions: cdate (type: date) > outputColumnNames: _col0 > Statistics: Num rows: 6144 Data size: 1233808 Basic > stats: COMPLETE Column stats: NONE > Reduce Output Operator > key expressions: _col0 (type: date) > sort order: + > Statistics: Num rows: 6144 Data size: 1233808 Basic > stats: COMPLETE Column stats: NONE > Execution mode: vectorized, llap > LLAP IO: all inputs > Map Vectorization: > enabled: true > enabledConditionsMet: > hive.vectorized.use.vectorized.input.format IS true > groupByVectorOutput: true > inputFileFormats: > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat > allNative: false > usesVectorUDFAdaptor: false > vectorized: true > Reducer 2 > Execution mode: vectorized, llap > Reduce Vectorization: > enabled: true > enableConditionsMet:
[jira] [Updated] (HIVE-14917) explainanalyze_2.q fails after HIVE-14861
[ https://issues.apache.org/jira/browse/HIVE-14917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-14917: --- Status: Patch Available (was: Open) > explainanalyze_2.q fails after HIVE-14861 > - > > Key: HIVE-14917 > URL: https://issues.apache.org/jira/browse/HIVE-14917 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14917.01.patch, HIVE-14917.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14917) explainanalyze_2.q fails after HIVE-14861
[ https://issues.apache.org/jira/browse/HIVE-14917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-14917: --- Attachment: HIVE-14917.02.patch > explainanalyze_2.q fails after HIVE-14861 > - > > Key: HIVE-14917 > URL: https://issues.apache.org/jira/browse/HIVE-14917 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14917.01.patch, HIVE-14917.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14917) explainanalyze_2.q fails after HIVE-14861
[ https://issues.apache.org/jira/browse/HIVE-14917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-14917: --- Status: Open (was: Patch Available) > explainanalyze_2.q fails after HIVE-14861 > - > > Key: HIVE-14917 > URL: https://issues.apache.org/jira/browse/HIVE-14917 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14917.01.patch, HIVE-14917.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14799) Query operation are not thread safe during its cancellation
[ https://issues.apache.org/jira/browse/HIVE-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-14799: --- Attachment: HIVE-14799.3.patch Revised the patch to use the model [~sershe] suggested. The close operation defers the resource release to the query thread if the driver is running (compiling/executing) the query; the resources are released once the query finishes (or is interrupted). Otherwise, close releases the driver's resources itself. So the close (or cancel) operation never has to wait. [~sershe], could you review it? I have also uploaded the new patch to RB: https://reviews.apache.org/r/52559/ > Query operations are not thread safe during cancellation > --- > > Key: HIVE-14799 > URL: https://issues.apache.org/jira/browse/HIVE-14799 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-14799.1.patch, HIVE-14799.2.patch, > HIVE-14799.3.patch, HIVE-14799.patch > > > When a query is cancelled either via Beeline (Ctrl-C) or the API call > TCLIService.Client.CancelOperation, SQLOperation.cancel is invoked in a > different thread from the one running the query to close/destroy its > encapsulated Driver object. Neither SQLOperation nor Driver is thread-safe, > which can sometimes result in runtime exceptions such as NPE. Errors from > the running query are also not handled properly, which may leave some > resources (files, locks, etc.) uncleaned after the query terminates. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
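The deferred-release model described in the comment above can be sketched roughly as follows. This is a minimal illustration in Python (the actual patch is Java against Hive's SQLOperation/Driver classes); all names here are hypothetical:

```python
# Hypothetical sketch of the deferred-release model: close() frees resources
# itself only when no query is running; otherwise it just sets a flag and the
# query thread performs the cleanup when it finishes, so close() never blocks.
import threading

class DriverSketch:
    def __init__(self):
        self._lock = threading.Lock()
        self._running = False
        self._close_requested = False
        self.released = False

    def run_query(self):
        with self._lock:
            self._running = True
        try:
            pass  # ... compile/execute the query ...
        finally:
            with self._lock:
                self._running = False
                if self._close_requested:
                    self._release()  # deferred cleanup on the query thread

    def close(self):
        with self._lock:
            self._close_requested = True
            if not self._running:
                self._release()  # no query in flight: release directly

    def _release(self):
        # idempotent: free files, locks, etc. exactly once
        if not self.released:
            self.released = True

d = DriverSketch()
d.close()  # no query in flight, so close releases immediately
```

Either way the resources are released exactly once, and the cancelling thread never waits on the query thread.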
[jira] [Updated] (HIVE-14913) Add new unit tests
[ https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-14913: --- Status: Open (was: Patch Available) > Add new unit tests > -- > > Key: HIVE-14913 > URL: https://issues.apache.org/jira/browse/HIVE-14913 > Project: Hive > Issue Type: Task > Components: Tests >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch > > > Moving a bunch of tests from system tests to Hive unit tests to reduce > testing overhead -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11394: Attachment: HIVE-11394.07.patch > Enhance EXPLAIN display for vectorization > - > > Key: HIVE-11394 > URL: https://issues.apache.org/jira/browse/HIVE-11394 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, > HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, > HIVE-11394.06.patch, HIVE-11394.07.patch > > > Add detail to the EXPLAIN output showing why a Map or Reduce work is not > vectorized. > New syntax is: EXPLAIN VECTORIZATION \[ONLY\] \[SUMMARY|DETAIL\] > The ONLY option suppresses most non-vectorization elements. > SUMMARY shows vectorization information for the PLAN (is vectorization > enabled) and a summary of Map and Reduce work. > By default, ONLY is off and the level is SUMMARY. > Here are some examples: > EXPLAIN VECTORIZATION example: > (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization > sections) > It is the same as EXPLAIN VECTORIZATION SUMMARY. 
> {code} > PLAN VECTORIZATION: > enabled: true > enabledConditionsMet: [hive.vectorized.execution.enabled IS true] > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Tez > … > Edges: > Reducer 2 <- Map 1 (SIMPLE_EDGE) > … > Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: decimal_date_test > Statistics: Num rows: 12288 Data size: 2467616 Basic stats: > COMPLETE Column stats: NONE > Filter Operator > predicate: cdate BETWEEN 1969-12-30 AND 1970-01-02 (type: > boolean) > Statistics: Num rows: 6144 Data size: 1233808 Basic > stats: COMPLETE Column stats: NONE > Select Operator > expressions: cdate (type: date) > outputColumnNames: _col0 > Statistics: Num rows: 6144 Data size: 1233808 Basic > stats: COMPLETE Column stats: NONE > Reduce Output Operator > key expressions: _col0 (type: date) > sort order: + > Statistics: Num rows: 6144 Data size: 1233808 Basic > stats: COMPLETE Column stats: NONE > Execution mode: vectorized, llap > LLAP IO: all inputs > Map Vectorization: > enabled: true > enabledConditionsMet: > hive.vectorized.use.vectorized.input.format IS true > groupByVectorOutput: true > inputFileFormats: > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat > allNative: false > usesVectorUDFAdaptor: false > vectorized: true > Reducer 2 > Execution mode: vectorized, llap > Reduce Vectorization: > enabled: true > enableConditionsMet: hive.vectorized.execution.reduce.enabled > IS true, hive.execution.engine tez IN [tez, spark] IS true > groupByVectorOutput: true > allNative: false > usesVectorUDFAdaptor: false > vectorized: true > Reduce Operator Tree: > Select Operator > expressions: KEY.reducesinkkey0 (type: date) > outputColumnNames: _col0 > Statistics: Num rows: 6144 Data size: 1233808 Basic stats: > COMPLETE Column stats: NONE > File Output Operator > compressed: false > Statistics: Num rows: 6144 Data size: 1233808 Basic stats: > COMPLETE Column stats: NONE > table: > input format: > 
org.apache.hadoop.mapred.SequenceFileInputFormat > output format: > org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat > serde: > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > {code} > EXPLAIN VECTORIZATION DETAIL > (Note the added Select Vectorization, Group By Vectorization, Reduce Sink > Vectorization sections in this example) > {code} > PLAN VECTORIZATION: > enabled: true > enabledConditionsMet: [hive.vectorized.execution.enabled IS true] > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Tez > … > Edges: > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Reducer 3 <- Reducer 2 (SIMPLE_EDGE) > … > Vertices: > Map 1 > Map
[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11394: Status: In Progress (was: Patch Available) > Enhance EXPLAIN display for vectorization > - > > Key: HIVE-11394 > URL: https://issues.apache.org/jira/browse/HIVE-11394 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, > HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, > HIVE-11394.06.patch, HIVE-11394.07.patch > > > Add detail to the EXPLAIN output showing why a Map or Reduce work is not > vectorized. > New syntax is: EXPLAIN VECTORIZATION \[ONLY\] \[SUMMARY|DETAIL\] > The ONLY option suppresses most non-vectorization elements. > SUMMARY shows vectorization information for the PLAN (is vectorization > enabled) and a summary of Map and Reduce work. > By default, ONLY is off and the level is SUMMARY. > Here are some examples: > EXPLAIN VECTORIZATION example: > (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization > sections) > It is the same as EXPLAIN VECTORIZATION SUMMARY. 
> {code} > PLAN VECTORIZATION: > enabled: true > enabledConditionsMet: [hive.vectorized.execution.enabled IS true] > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Tez > … > Edges: > Reducer 2 <- Map 1 (SIMPLE_EDGE) > … > Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: decimal_date_test > Statistics: Num rows: 12288 Data size: 2467616 Basic stats: > COMPLETE Column stats: NONE > Filter Operator > predicate: cdate BETWEEN 1969-12-30 AND 1970-01-02 (type: > boolean) > Statistics: Num rows: 6144 Data size: 1233808 Basic > stats: COMPLETE Column stats: NONE > Select Operator > expressions: cdate (type: date) > outputColumnNames: _col0 > Statistics: Num rows: 6144 Data size: 1233808 Basic > stats: COMPLETE Column stats: NONE > Reduce Output Operator > key expressions: _col0 (type: date) > sort order: + > Statistics: Num rows: 6144 Data size: 1233808 Basic > stats: COMPLETE Column stats: NONE > Execution mode: vectorized, llap > LLAP IO: all inputs > Map Vectorization: > enabled: true > enabledConditionsMet: > hive.vectorized.use.vectorized.input.format IS true > groupByVectorOutput: true > inputFileFormats: > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat > allNative: false > usesVectorUDFAdaptor: false > vectorized: true > Reducer 2 > Execution mode: vectorized, llap > Reduce Vectorization: > enabled: true > enableConditionsMet: hive.vectorized.execution.reduce.enabled > IS true, hive.execution.engine tez IN [tez, spark] IS true > groupByVectorOutput: true > allNative: false > usesVectorUDFAdaptor: false > vectorized: true > Reduce Operator Tree: > Select Operator > expressions: KEY.reducesinkkey0 (type: date) > outputColumnNames: _col0 > Statistics: Num rows: 6144 Data size: 1233808 Basic stats: > COMPLETE Column stats: NONE > File Output Operator > compressed: false > Statistics: Num rows: 6144 Data size: 1233808 Basic stats: > COMPLETE Column stats: NONE > table: > input format: > 
org.apache.hadoop.mapred.SequenceFileInputFormat > output format: > org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat > serde: > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > {code} > EXPLAIN VECTORIZATION DETAIL > (Note the added Select Vectorization, Group By Vectorization, Reduce Sink > Vectorization sections in this example) > {code} > PLAN VECTORIZATION: > enabled: true > enabledConditionsMet: [hive.vectorized.execution.enabled IS true] > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Tez > … > Edges: > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Reducer 3 <- Reducer 2 (SIMPLE_EDGE) > … > Vertices: > Map 1 >
[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11394: Status: Patch Available (was: In Progress) > Enhance EXPLAIN display for vectorization > - > > Key: HIVE-11394 > URL: https://issues.apache.org/jira/browse/HIVE-11394 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, > HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, > HIVE-11394.06.patch, HIVE-11394.07.patch > > > Add detail to the EXPLAIN output showing why a Map or Reduce work is not > vectorized. > New syntax is: EXPLAIN VECTORIZATION \[ONLY\] \[SUMMARY|DETAIL\] > The ONLY option suppresses most non-vectorization elements. > SUMMARY shows vectorization information for the PLAN (is vectorization > enabled) and a summary of Map and Reduce work. > By default, ONLY is off and the level is SUMMARY. > Here are some examples: > EXPLAIN VECTORIZATION example: > (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization > sections) > It is the same as EXPLAIN VECTORIZATION SUMMARY. 
> {code} > PLAN VECTORIZATION: > enabled: true > enabledConditionsMet: [hive.vectorized.execution.enabled IS true] > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Tez > … > Edges: > Reducer 2 <- Map 1 (SIMPLE_EDGE) > … > Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: decimal_date_test > Statistics: Num rows: 12288 Data size: 2467616 Basic stats: > COMPLETE Column stats: NONE > Filter Operator > predicate: cdate BETWEEN 1969-12-30 AND 1970-01-02 (type: > boolean) > Statistics: Num rows: 6144 Data size: 1233808 Basic > stats: COMPLETE Column stats: NONE > Select Operator > expressions: cdate (type: date) > outputColumnNames: _col0 > Statistics: Num rows: 6144 Data size: 1233808 Basic > stats: COMPLETE Column stats: NONE > Reduce Output Operator > key expressions: _col0 (type: date) > sort order: + > Statistics: Num rows: 6144 Data size: 1233808 Basic > stats: COMPLETE Column stats: NONE > Execution mode: vectorized, llap > LLAP IO: all inputs > Map Vectorization: > enabled: true > enabledConditionsMet: > hive.vectorized.use.vectorized.input.format IS true > groupByVectorOutput: true > inputFileFormats: > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat > allNative: false > usesVectorUDFAdaptor: false > vectorized: true > Reducer 2 > Execution mode: vectorized, llap > Reduce Vectorization: > enabled: true > enableConditionsMet: hive.vectorized.execution.reduce.enabled > IS true, hive.execution.engine tez IN [tez, spark] IS true > groupByVectorOutput: true > allNative: false > usesVectorUDFAdaptor: false > vectorized: true > Reduce Operator Tree: > Select Operator > expressions: KEY.reducesinkkey0 (type: date) > outputColumnNames: _col0 > Statistics: Num rows: 6144 Data size: 1233808 Basic stats: > COMPLETE Column stats: NONE > File Output Operator > compressed: false > Statistics: Num rows: 6144 Data size: 1233808 Basic stats: > COMPLETE Column stats: NONE > table: > input format: > 
org.apache.hadoop.mapred.SequenceFileInputFormat > output format: > org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat > serde: > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > {code} > EXPLAIN VECTORIZATION DETAIL > (Note the added Select Vectorization, Group By Vectorization, Reduce Sink > Vectorization sections in this example) > {code} > PLAN VECTORIZATION: > enabled: true > enabledConditionsMet: [hive.vectorized.execution.enabled IS true] > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Tez > … > Edges: > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Reducer 3 <- Reducer 2 (SIMPLE_EDGE) > … > Vertices: > Map 1 >
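The SUMMARY output above is plain, line-oriented `key: value` text, so it is easy to post-process. A minimal sketch (a hypothetical Python helper, not part of Hive) that extracts the plan-level flags from output shaped like the example:

```python
def parse_plan_vectorization(explain_text):
    """Extract plan-level flags from EXPLAIN VECTORIZATION SUMMARY text.

    Hypothetical helper: it only reads the PLAN VECTORIZATION block shown
    above ("enabled: true" and "enabledConditionsMet: [...]")."""
    info = {}
    for line in explain_text.splitlines():
        line = line.strip()
        if line.startswith("enabled:"):
            info["enabled"] = line.split(":", 1)[1].strip() == "true"
        elif line.startswith("enabledConditionsMet:"):
            raw = line.split(":", 1)[1].strip().strip("[]")
            info["conditions"] = [c.strip() for c in raw.split(",") if c.strip()]
    return info

sample = """PLAN VECTORIZATION:
  enabled: true
  enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
"""
print(parse_plan_vectorization(sample))
```

This is only a parsing sketch for scripting around the new EXPLAIN syntax; the authoritative structure is whatever the patch actually emits.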
[jira] [Commented] (HIVE-14917) explainanalyze_2.q fails after HIVE-14861
[ https://issues.apache.org/jira/browse/HIVE-14917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15560409#comment-15560409 ] Hive QA commented on HIVE-14917: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832370/HIVE-14917.01.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10663 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1443/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1443/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1443/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12832370 - PreCommit-HIVE-Build > explainanalyze_2.q fails after HIVE-14861 > - > > Key: HIVE-14917 > URL: https://issues.apache.org/jira/browse/HIVE-14917 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14917.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14918) Function concat_ws get a wrong value
[ https://issues.apache.org/jira/browse/HIVE-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15560316#comment-15560316 ] Pengcheng Xiong commented on HIVE-14918: [~wisgood], I tried on the current master and it is working fine. If you have a problem on 2.1, maybe we should backport some patches. {code} hive> create table d (c1 string, c2 string, c3 string); OK hive> FROM src INSERT OVERWRITE TABLE d SELECT 'abc', 'xyz', '8675309' WHERE src.key = 86; Query ID = pxiong_20161009101753_605845db-aeb9-4dab-87d6-5ad51fab1f79 Total jobs = 1 hive> select * from d; OK abc xyz 8675309 Time taken: 0.127 seconds, Fetched: 1 row(s) hive> SELECT concat_ws('.',NULL) FROM d; OK Time taken: 0.096 seconds, Fetched: 1 row(s) {code} A rewritten query also works fine: {code} hive> SELECT concat_ws('.',NULL) FROM (SELECT 'abc', 'xyz', '8675309' from src WHERE src.key = 86)subq; OK Time taken: 0.272 seconds, Fetched: 1 row(s) {code} > Function concat_ws get a wrong value > -- > > Key: HIVE-14918 > URL: https://issues.apache.org/jira/browse/HIVE-14918 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 1.1.1, 2.0.0, 2.1.0, 2.0.1 >Reporter: Xiaowei Wang >Assignee: Xiaowei Wang >Priority: Critical > Fix For: 2.1.0 > > Attachments: HIVE-14918.0.patch > > > FROM src INSERT OVERWRITE TABLE dest1 SELECT 'abc', 'xyz', '8675309' WHERE > src.key = 86; > SELECT concat_ws('.',NULL) FROM dest1; > The result is an empty string "", but I think it should return NULL. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
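The point of contention in this report is the edge case where every argument to concat_ws is NULL: concat_ws skips NULL arguments (unlike concat, which nulls the whole result), so an all-NULL argument list collapses to the empty string, while the reporter expects NULL. A rough Python model of the two behaviors (an illustration only, not Hive's actual implementation):

```python
def concat_ws_current(sep, *args):
    """Model of the reported behavior: NULL (None) arguments are skipped,
    so an all-NULL argument list yields the empty string."""
    return sep.join(str(a) for a in args if a is not None)

def concat_ws_proposed(sep, *args):
    """Reporter's expectation: return NULL (None) when every argument is NULL."""
    parts = [str(a) for a in args if a is not None]
    return sep.join(parts) if parts else None

# The disputed edge case:
assert concat_ws_current(".", None) == ""
assert concat_ws_proposed(".", None) is None
# Both agree when at least one argument is non-NULL:
assert concat_ws_current(".", "abc", None, "xyz") == "abc.xyz"
```

The all-NULL case is exactly where the two definitions diverge, which is why the comment above could not reproduce a problem with data present.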
[jira] [Updated] (HIVE-14917) explainanalyze_2.q fails after HIVE-14861
[ https://issues.apache.org/jira/browse/HIVE-14917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-14917: --- Status: Open (was: Patch Available) > explainanalyze_2.q fails after HIVE-14861 > - > > Key: HIVE-14917 > URL: https://issues.apache.org/jira/browse/HIVE-14917 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14917.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14917) explainanalyze_2.q fails after HIVE-14861
[ https://issues.apache.org/jira/browse/HIVE-14917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-14917: --- Status: Patch Available (was: Open) > explainanalyze_2.q fails after HIVE-14861 > - > > Key: HIVE-14917 > URL: https://issues.apache.org/jira/browse/HIVE-14917 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14917.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14917) explainanalyze_2.q fails after HIVE-14861
[ https://issues.apache.org/jira/browse/HIVE-14917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-14917: --- Attachment: HIVE-14917.01.patch > explainanalyze_2.q fails after HIVE-14861 > - > > Key: HIVE-14917 > URL: https://issues.apache.org/jira/browse/HIVE-14917 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14917.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14917) explainanalyze_2.q fails after HIVE-14861
[ https://issues.apache.org/jira/browse/HIVE-14917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-14917: --- Attachment: (was: HIVE-14917.01.patch) > explainanalyze_2.q fails after HIVE-14861 > - > > Key: HIVE-14917 > URL: https://issues.apache.org/jira/browse/HIVE-14917 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14662) Wrong Class Instance When Using Custom SERDE
[ https://issues.apache.org/jira/browse/HIVE-14662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15559867#comment-15559867 ] Hive QA commented on HIVE-14662: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832341/HIVE-14662.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10663 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching org.apache.hive.service.cli.TestEmbeddedThriftBinaryCLIService.testTaskStatus {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1442/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1442/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1442/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12832341 - PreCommit-HIVE-Build > Wrong Class Instance When Using Custom SERDE > > > Key: HIVE-14662 > URL: https://issues.apache.org/jira/browse/HIVE-14662 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Reporter: Nemon Lou >Assignee: Nemon Lou > Attachments: HIVE-14662.patch > > > Using [SERDE for > mongoDB|https://github.com/mongodb/mongo-hadoop/blob/master/hive/src/main/java/com/mongodb/hadoop/hive/BSONSerDe.java] > DDL > {noformat} > create external table mytable (ID STRING..) > ROW FORMAT SERDE 'com.mongodb.hadoop.hive.BSONSerDe' > WITH SERDEPROPERTIES('mongo.columns.mapping'='{"ID":"_id",.. 
}') > STORED AS INPUTFORMAT 'com.mongodb.hadoop.mapred.BSONFileInputFormat' > OUTPUTFORMAT 'com.mongodb.hadoop.hive.output.HiveBSONFileOutputFormat' > LOCATION 'hdfs:///mypath'; > {noformat} > Open beeline and run the following query, then open another beeline and run > it again. The second run fails. > {noformat} > add jar hdfs:///tmp/mongo-hadoop-hive-1.4.2_new.jar; > add jar hdfs:///tmp/mongo-java-driver-3.0.4.jar; > add jar hdfs:///tmp/mongo-hadoop-core-1.4.2_new.jar; > select * from mytable limit 1; > {noformat} > Error log : > {noformat} > 2016-08-25 09:30:34,475 | WARN | HiveServer2-Handler-Pool: Thread-11972 | > Error fetching results: | > org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:1058) > org.apache.hive.service.cli.HiveSQLException: java.io.IOException: > org.apache.hadoop.hive.serde2.SerDeException: class > com.mongodb.hadoop.hive.BSONSerDerequires a BSONWritable object, notclass > com.mongodb.hadoop.io.BSONWritable > at > org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:366) > at > org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:251) > at > org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:710) > at sun.reflect.GeneratedMethodAccessor45.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78) > at > org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36) > at > org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1673) > at > 
org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59) > at com.sun.proxy.$Proxy20.fetchResults(Unknown Source) > at > org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:451) > at > org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:1049) > at > org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1553) > at > org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1538) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at
[jira] [Updated] (HIVE-14662) Wrong Class Instance When Using Custom SERDE
[ https://issues.apache.org/jira/browse/HIVE-14662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nemon Lou updated HIVE-14662: - Attachment: HIVE-14662.patch > Wrong Class Instance When Using Custom SERDE > > > Key: HIVE-14662 > URL: https://issues.apache.org/jira/browse/HIVE-14662 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Reporter: Nemon Lou >Assignee: Nemon Lou > Attachments: HIVE-14662.patch > > > Using [SERDE for > mongoDB|https://github.com/mongodb/mongo-hadoop/blob/master/hive/src/main/java/com/mongodb/hadoop/hive/BSONSerDe.java] > DDL > {noformat} > create external table mytable (ID STRING..) > ROW FORMAT SERDE 'com.mongodb.hadoop.hive.BSONSerDe' > WITH SERDEPROPERTIES('mongo.columns.mapping'='{"ID":"_id",.. }') > STORED AS INPUTFORMAT 'com.mongodb.hadoop.mapred.BSONFileInputFormat' > OUTPUTFORMAT 'com.mongodb.hadoop.hive.output.HiveBSONFileOutputFormat' > LOCATION 'hdfs:///mypath'; > {noformat} > Open beeline and run the following query, then open another beeline and run > it again. The second run fails. 
> {noformat} > add jar hdfs:///tmp/mongo-hadoop-hive-1.4.2_new.jar; > add jar hdfs:///tmp/mongo-java-driver-3.0.4.jar; > add jar hdfs:///tmp/mongo-hadoop-core-1.4.2_new.jar; > select * from mytable limit 1; > {noformat} > Error log : > {noformat} > 2016-08-25 09:30:34,475 | WARN | HiveServer2-Handler-Pool: Thread-11972 | > Error fetching results: | > org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:1058) > org.apache.hive.service.cli.HiveSQLException: java.io.IOException: > org.apache.hadoop.hive.serde2.SerDeException: class > com.mongodb.hadoop.hive.BSONSerDerequires a BSONWritable object, notclass > com.mongodb.hadoop.io.BSONWritable > at > org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:366) > at > org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:251) > at > org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:710) > at sun.reflect.GeneratedMethodAccessor45.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78) > at > org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36) > at > org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1673) > at > org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59) > at com.sun.proxy.$Proxy20.fetchResults(Unknown Source) > at > org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:451) > at > 
org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:1049) > at > org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1553) > at > org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1538) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) > at > org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:692) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.io.IOException: org.apache.hadoop.hive.serde2.SerDeException: > class com.mongodb.hadoop.hive.BSONSerDerequires a BSONWritable object, > notclass com.mongodb.hadoop.io.BSONWritable > at > org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507) > at > org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414) > at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:140) > at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1756) > at > org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:361) > ... 24 more > Caused by:
[jira] [Updated] (HIVE-14662) Wrong Class Instance When Using Custom SERDE
[ https://issues.apache.org/jira/browse/HIVE-14662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nemon Lou updated HIVE-14662: - Status: Patch Available (was: Open) > Wrong Class Instance When Using Custom SERDE > > > Key: HIVE-14662 > URL: https://issues.apache.org/jira/browse/HIVE-14662 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Reporter: Nemon Lou >Assignee: Nemon Lou > Attachments: HIVE-14662.patch > > > Using [SERDE for > mongoDB|https://github.com/mongodb/mongo-hadoop/blob/master/hive/src/main/java/com/mongodb/hadoop/hive/BSONSerDe.java] > DDL > {noformat} > create external table mytable (ID STRING..) > ROW FORMAT SERDE 'com.mongodb.hadoop.hive.BSONSerDe' > WITH SERDEPROPERTIES('mongo.columns.mapping'='{"ID":"_id",.. }') > STORED AS INPUTFORMAT 'com.mongodb.hadoop.mapred.BSONFileInputFormat' > OUTPUTFORMAT 'com.mongodb.hadoop.hive.output.HiveBSONFileOutputFormat' > LOCATION 'hdfs:///mypath'; > {noformat} > Open beeline and run the following query, then open another beeline and run > it again. The second run fails. 
> {noformat} > add jar hdfs:///tmp/mongo-hadoop-hive-1.4.2_new.jar; > add jar hdfs:///tmp/mongo-java-driver-3.0.4.jar; > add jar hdfs:///tmp/mongo-hadoop-core-1.4.2_new.jar; > select * from mytable limit 1; > {noformat} > Error log : > {noformat} > 2016-08-25 09:30:34,475 | WARN | HiveServer2-Handler-Pool: Thread-11972 | > Error fetching results: | > org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:1058) > org.apache.hive.service.cli.HiveSQLException: java.io.IOException: > org.apache.hadoop.hive.serde2.SerDeException: class > com.mongodb.hadoop.hive.BSONSerDerequires a BSONWritable object, notclass > com.mongodb.hadoop.io.BSONWritable > at > org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:366) > at > org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:251) > at > org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:710) > at sun.reflect.GeneratedMethodAccessor45.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78) > at > org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36) > at > org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1673) > at > org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59) > at com.sun.proxy.$Proxy20.fetchResults(Unknown Source) > at > org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:451) > at > 
org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:1049) > at > org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1553) > at > org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1538) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) > at > org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:692) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.io.IOException: org.apache.hadoop.hive.serde2.SerDeException: > class com.mongodb.hadoop.hive.BSONSerDerequires a BSONWritable object, > notclass com.mongodb.hadoop.io.BSONWritable > at > org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507) > at > org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414) > at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:140) > at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1756) > at > org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:361) > ... 24 more >
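The SerDeException above ("requires a BSONWritable object, not class com.mongodb.hadoop.io.BSONWritable") is the classic two-classloaders symptom: in the JVM, a class's identity is the pair (name, classloader), so the same class name loaded through two different `add jar` sessions yields two distinct types and the instanceof check fails even though the names match. A rough Python analogy of that identity rule (an illustration only, not the Hive fix):

```python
def load_class():
    # Stands in for a fresh classloader defining BSONWritable again,
    # e.g. a second beeline session re-adding the same jar.
    class BSONWritable:
        pass
    return BSONWritable

FirstLoad = load_class()   # "classloader" of the first session
SecondLoad = load_class()  # "classloader" of the second session

obj = SecondLoad()
assert FirstLoad.__name__ == SecondLoad.__name__ == "BSONWritable"
assert not isinstance(obj, FirstLoad)  # same name, different class object
assert isinstance(obj, SecondLoad)
```

This is why the error message prints what looks like the same class on both sides: the names are equal, but the class objects are not.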
[jira] [Commented] (HIVE-14916) Reduce the memory requirements for Spark tests
[ https://issues.apache.org/jira/browse/HIVE-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15559690#comment-15559690 ] Hive QA commented on HIVE-14916: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832331/HIVE-14916.001.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10663 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[constprog_semijoin] org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[index_bitmap3] org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[index_bitmap_auto] org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[infer_bucket_sort_map_operators] org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[infer_bucket_sort_reducers_power_two] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1441/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1441/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1441/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12832331 - PreCommit-HIVE-Build > Reduce the memory requirements for Spark tests > -- > > Key: HIVE-14916 > URL: https://issues.apache.org/jira/browse/HIVE-14916 > Project: Hive > Issue Type: Sub-task >Reporter: Ferdinand Xu >Assignee: Dapeng Sun > Attachments: HIVE-14916.001.patch > > > As HIVE-14887, we need to reduce the memory requirements for Spark tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (HIVE-5867) JDBC driver and beeline should support executing an initial SQL script
[ https://issues.apache.org/jira/browse/HIVE-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianguo Tian updated HIVE-5867: --- Comment: was deleted (was: Hive JDBC client) > JDBC driver and beeline should support executing an initial SQL script > -- > > Key: HIVE-5867 > URL: https://issues.apache.org/jira/browse/HIVE-5867 > Project: Hive > Issue Type: Improvement > Components: Clients, JDBC >Reporter: Prasad Mujumdar >Assignee: Jianguo Tian > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-5867.1.patch, HIVE-5867.2.patch, HIVE-5867.3 .patch > > > HiveCLI support the .hiverc script that is executed at the start of the > session. This is helpful for things like registering UDFs, session specific > configs etc. > This functionality is missing for beeline and JDBC clients. It would be > useful for JDBC driver to support an init script with SQL statements that's > automatically executed after connection. The script path can be specified via > JDBC connection URL. For example > {noformat} > jdbc:hive2://localhost:1/default;initScript=/home/user1/scripts/init.sql > {noformat} > This can be added to Beeline's command line option like "-i > /home/user1/scripts/init.sql" > To help transition from HiveCLI to Beeline, we can keep the default init > script as $HOME/.hiverc -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (HIVE-5867) JDBC driver and beeline should support executing an initial SQL script
[ https://issues.apache.org/jira/browse/HIVE-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianguo Tian updated HIVE-5867: --- Comment: was deleted (was: The "initFile" option in JDBC URL could be seen on the wiki.) > JDBC driver and beeline should support executing an initial SQL script > -- > > Key: HIVE-5867 > URL: https://issues.apache.org/jira/browse/HIVE-5867 > Project: Hive > Issue Type: Improvement > Components: Clients, JDBC >Reporter: Prasad Mujumdar >Assignee: Jianguo Tian > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-5867.1.patch, HIVE-5867.2.patch, HIVE-5867.3 .patch > > > HiveCLI support the .hiverc script that is executed at the start of the > session. This is helpful for things like registering UDFs, session specific > configs etc. > This functionality is missing for beeline and JDBC clients. It would be > useful for JDBC driver to support an init script with SQL statements that's > automatically executed after connection. The script path can be specified via > JDBC connection URL. For example > {noformat} > jdbc:hive2://localhost:1/default;initScript=/home/user1/scripts/init.sql > {noformat} > This can be added to Beeline's command line option like "-i > /home/user1/scripts/init.sql" > To help transition from HiveCLI to Beeline, we can keep the default init > script as $HOME/.hiverc -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-5867) JDBC driver and beeline should support executing an initial SQL script
[ https://issues.apache.org/jira/browse/HIVE-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15559594#comment-15559594 ] Jianguo Tian commented on HIVE-5867: I have added the "initFile=" option to the JDBC URL; you can now see some changes under "Connection URL Format" and "Connection URL for Remote or Embedded Mode". > JDBC driver and beeline should support executing an initial SQL script > -- > > Key: HIVE-5867 > URL: https://issues.apache.org/jira/browse/HIVE-5867 > Project: Hive > Issue Type: Improvement > Components: Clients, JDBC >Reporter: Prasad Mujumdar >Assignee: Jianguo Tian > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-5867.1.patch, HIVE-5867.2.patch, HIVE-5867.3 .patch > > > HiveCLI support the .hiverc script that is executed at the start of the > session. This is helpful for things like registering UDFs, session specific > configs etc. > This functionality is missing for beeline and JDBC clients. It would be > useful for JDBC driver to support an init script with SQL statements that's > automatically executed after connection. The script path can be specified via > JDBC connection URL. For example > {noformat} > jdbc:hive2://localhost:1/default;initScript=/home/user1/scripts/init.sql > {noformat} > This can be added to Beeline's command line option like "-i > /home/user1/scripts/init.sql" > To help transition from HiveCLI to Beeline, we can keep the default init > script as $HOME/.hiverc -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-5867) JDBC driver and beeline should support executing an initial SQL script
[ https://issues.apache.org/jira/browse/HIVE-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15559594#comment-15559594 ] Jianguo Tian edited comment on HIVE-5867 at 10/9/16 8:35 AM: - I have added "initFile=" option in the JDBC URL, now you can see some changes on wiki about "Connection URL Format" and "Connection URL for Remote or Embedded Mode". was (Author: jonnyr): I have added "initFile=" option in the JDBC URL, now you can see some changes about "Connection URL Format" and "Connection URL for Remote or Embedded Mode". > JDBC driver and beeline should support executing an initial SQL script > -- > > Key: HIVE-5867 > URL: https://issues.apache.org/jira/browse/HIVE-5867 > Project: Hive > Issue Type: Improvement > Components: Clients, JDBC >Reporter: Prasad Mujumdar >Assignee: Jianguo Tian > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-5867.1.patch, HIVE-5867.2.patch, HIVE-5867.3 .patch > > > HiveCLI support the .hiverc script that is executed at the start of the > session. This is helpful for things like registering UDFs, session specific > configs etc. > This functionality is missing for beeline and JDBC clients. It would be > useful for JDBC driver to support an init script with SQL statements that's > automatically executed after connection. The script path can be specified via > JDBC connection URL. For example > {noformat} > jdbc:hive2://localhost:1/default;initScript=/home/user1/scripts/init.sql > {noformat} > This can be added to Beeline's command line option like "-i > /home/user1/scripts/init.sql" > To help transition from HiveCLI to Beeline, we can keep the default init > script as $HOME/.hiverc -- This message was sent by Atlassian JIRA (v6.3.4#6332)
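The init-script path rides on the hive2 connection URL as a semicolon-separated session variable. A small sketch of composing such a URL (a hypothetical helper, not the Hive JDBC parser; note the option name has appeared as both "initScript" and "initFile" in this thread, and the host/port below are made up):

```python
def with_init_script(base_url, script_path, key="initFile"):
    """Append an init-script session variable to a hive2 JDBC-style URL.

    Hypothetical helper: hive2 URLs separate session variables with ';',
    so we just make sure exactly one separator precedes the new pair."""
    sep = "" if base_url.endswith(";") else ";"
    return f"{base_url}{sep}{key}={script_path}"

url = with_init_script("jdbc:hive2://localhost:10000/default",
                       "/home/user1/scripts/init.sql")
print(url)
```

The Beeline equivalent described above is the command-line flag `-i /home/user1/scripts/init.sql`, which serves the same purpose without touching the URL.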
[jira] [Commented] (HIVE-14918) Function concat_ws get a wrong value
[ https://issues.apache.org/jira/browse/HIVE-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15559558#comment-15559558 ] Hive QA commented on HIVE-14918: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832319/HIVE-14918.0.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10663 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching org.apache.hive.spark.client.TestSparkClient.testJobSubmission {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1440/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1440/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1440/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12832319 - PreCommit-HIVE-Build > Function concat_ws get a wrong value > -- > > Key: HIVE-14918 > URL: https://issues.apache.org/jira/browse/HIVE-14918 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 1.1.1, 2.0.0, 2.1.0, 2.0.1 >Reporter: Xiaowei Wang >Assignee: Xiaowei Wang >Priority: Critical > Fix For: 2.1.0 > > Attachments: HIVE-14918.0.patch > > > FROM src INSERT OVERWRITE TABLE dest1 SELECT 'abc', 'xyz', '8675309' WHERE > src.key = 86; > SELECT concat_ws('.',NULL) FROM dest1; > The result is an empty string "", but I think it should return NULL. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14916) Reduce the memory requirements for Spark tests
[ https://issues.apache.org/jira/browse/HIVE-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dapeng Sun updated HIVE-14916: -- Status: Patch Available (was: Open) Uploaded an initial patch > Reduce the memory requirements for Spark tests > -- > > Key: HIVE-14916 > URL: https://issues.apache.org/jira/browse/HIVE-14916 > Project: Hive > Issue Type: Sub-task >Reporter: Ferdinand Xu >Assignee: Dapeng Sun > Attachments: HIVE-14916.001.patch > > > As in HIVE-14887, we need to reduce the memory requirements for Spark tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14916) Reduce the memory requirements for Spark tests
[ https://issues.apache.org/jira/browse/HIVE-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dapeng Sun updated HIVE-14916: -- Attachment: HIVE-14916.001.patch > Reduce the memory requirements for Spark tests > -- > > Key: HIVE-14916 > URL: https://issues.apache.org/jira/browse/HIVE-14916 > Project: Hive > Issue Type: Sub-task >Reporter: Ferdinand Xu >Assignee: Dapeng Sun > Attachments: HIVE-14916.001.patch > > > As in HIVE-14887, we need to reduce the memory requirements for Spark tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-14916) Reduce the memory requirements for Spark tests
[ https://issues.apache.org/jira/browse/HIVE-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dapeng Sun reassigned HIVE-14916: - Assignee: Dapeng Sun > Reduce the memory requirements for Spark tests > -- > > Key: HIVE-14916 > URL: https://issues.apache.org/jira/browse/HIVE-14916 > Project: Hive > Issue Type: Sub-task >Reporter: Ferdinand Xu >Assignee: Dapeng Sun > > As in HIVE-14887, we need to reduce the memory requirements for Spark tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14918) Function concat_ws get a wrong value
[ https://issues.apache.org/jira/browse/HIVE-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15559471#comment-15559471 ] Xiaowei Wang commented on HIVE-14918: - Is this a problem? [~pxiong] [~speleato] [~ashutoshc] [~prasanth_j] [~thejas] > Function concat_ws get a wrong value > -- > > Key: HIVE-14918 > URL: https://issues.apache.org/jira/browse/HIVE-14918 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 1.1.1, 2.0.0, 2.1.0, 2.0.1 >Reporter: Xiaowei Wang >Assignee: Xiaowei Wang >Priority: Critical > Fix For: 2.1.0 > > Attachments: HIVE-14918.0.patch > > > FROM src INSERT OVERWRITE TABLE dest1 SELECT 'abc', 'xyz', '8675309' WHERE > src.key = 86; > SELECT concat_ws('.',NULL) FROM dest1 ; > The result is an empty string "", but I think it should return NULL. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14918) Function concat_ws get a wrong value
[ https://issues.apache.org/jira/browse/HIVE-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaowei Wang updated HIVE-14918: Status: Patch Available (was: Open) > Function concat_ws get a wrong value > -- > > Key: HIVE-14918 > URL: https://issues.apache.org/jira/browse/HIVE-14918 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 2.0.1, 2.1.0, 2.0.0, 1.1.1 >Reporter: Xiaowei Wang >Assignee: Xiaowei Wang >Priority: Critical > Fix For: 2.1.0 > > Attachments: HIVE-14918.0.patch > > > FROM src INSERT OVERWRITE TABLE dest1 SELECT 'abc', 'xyz', '8675309' WHERE > src.key = 86; > SELECT concat_ws('.',NULL) FROM dest1 ; > The result is an empty string "", but I think it should return NULL. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14918) Function concat_ws get a wrong value
[ https://issues.apache.org/jira/browse/HIVE-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaowei Wang updated HIVE-14918: Attachment: HIVE-14918.0.patch > Function concat_ws get a wrong value > -- > > Key: HIVE-14918 > URL: https://issues.apache.org/jira/browse/HIVE-14918 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 1.1.1, 2.0.0, 2.1.0, 2.0.1 >Reporter: Xiaowei Wang >Assignee: Xiaowei Wang >Priority: Critical > Fix For: 2.1.0 > > Attachments: HIVE-14918.0.patch > > > FROM src INSERT OVERWRITE TABLE dest1 SELECT 'abc', 'xyz', '8675309' WHERE > src.key = 86; > SELECT concat_ws('.',NULL) FROM dest1 ; > The result is an empty string "", but I think it should return NULL. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
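The behaviour under discussion in HIVE-14918 can be modelled outside Hive. Below is a minimal Python sketch (an illustrative model only, not Hive's actual Java UDF code), assuming concat_ws skips NULL arguments as Hive documents; it contrasts the current result for an all-NULL input with the result the reporter argues for:

```python
def concat_ws_current(sep, *args):
    """Model of the current behaviour: NULL (None) arguments are skipped,
    so an all-NULL argument list collapses to the empty string ''."""
    return sep.join(str(a) for a in args if a is not None)

def concat_ws_proposed(sep, *args):
    """Behaviour the reporter expects: if every argument is NULL,
    return NULL (None) instead of the empty string."""
    parts = [str(a) for a in args if a is not None]
    return sep.join(parts) if parts else None

print(concat_ws_current('.', None))            # -> '' (current Hive result)
print(concat_ws_proposed('.', None))           # -> None (reported expectation)
print(concat_ws_current('.', 'a', None, 'b'))  # -> 'a.b' under either behaviour
```

Only the all-NULL case differs; mixed NULL/non-NULL inputs produce the same output either way, which is why the discrepancy is easy to miss.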
[jira] [Comment Edited] (HIVE-14632) beeline outputformat needs better documentation
[ https://issues.apache.org/jira/browse/HIVE-14632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15559375#comment-15559375 ] Lefty Leverenz edited comment on HIVE-14632 at 10/9/16 6:19 AM: Good documentation, thanks [~kuczoram]! I made some minor edits. Very cool expandable examples -- I hadn't realized we can do that. One question: is the misalignment of the 'comment' column in the tsv example accurate? I assume it's due to the tab stops because the 'value' column has values longer than the column name, but just wanted to check. +1 but a review by [~michaelthoward] would also be good, as well as a technical review by [~szehon] or [~thejas]. was (Author: le...@hortonworks.com): Good documentation, thanks [~kuczoram]! I made some minor edits. Very cool expandable examples -- I hadn't realized we can do that. One question: is the misalignment of the 'comment' column in the tsv example accurate? I assume it's due to the tab stops because the 'value' column has values longer than the column name, but just wanted to check. +1 but a technical review by [~michaelthoward] or [~thejas] would also be good. > beeline outputformat needs better documentation > --- > > Key: HIVE-14632 > URL: https://issues.apache.org/jira/browse/HIVE-14632 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 0.14.0 > Environment: Hive HiveServer2 wiki >Reporter: Michael Howard >Assignee: Marta Kuczora > > SUMMARY > * need better wiki page doc for beeline outputformat option > * should explicitly say that "double quote characters" are used to enclose > fields which need enclosing. > * Should describe the treatment of embedded double quote chars as "doubled" > DETAIL > The page at: > https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-Separated-ValueOutputFormats > describes separated value outputformats csv/tsv/csv2/tsv2, etc. > I found doc to be inadequate and terminology to be confusing. 
> > These conform better to standard CSV convention, which adds quotes around a > > cell value > What kind of quotes? The only reference to quotes in this section refers to > single quotes for the deprecated csv/tsv format. > The JIRA at > https://issues.apache.org/jira/browse/HIVE-8615 > clarifies a bit: > - Old format quoted every field. New format quotes only fields that contain a > delimiter or the quoting char. > - Old format quoted using single quotes, new format quotes using double > quotes > - Old format didn't escape quotes in a field (a bug). New format does escape > the quotes > However, neither this JIRA page nor the wiki page doc define what is meant by > "escaping the quotes". > Q: In this context, does escaping mean "backslash escaping" or "double > embedded double quotes" or something else? > Investigation of source code reveals that this is using SuperCSV. > SuperCSV does not support backslash-escape of embedded quotes. See last line > of: > https://super-csv.github.io/super-csv/csv_specification.html > THE END -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8615) beeline csv,tsv outputformat needs backward compatibility mode
[ https://issues.apache.org/jira/browse/HIVE-8615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15559379#comment-15559379 ] Lefty Leverenz commented on HIVE-8615: -- [~kuczoram] improved the documentation (for HIVE-14632). * [HiveServer2 Clients -- Beeline -- Output Formats | https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-OutputFormats] > beeline csv,tsv outputformat needs backward compatibility mode > -- > > Key: HIVE-8615 > URL: https://issues.apache.org/jira/browse/HIVE-8615 > Project: Hive > Issue Type: Bug > Components: Beeline >Affects Versions: 0.14.0 >Reporter: Thejas M Nair >Assignee: Thejas M Nair >Priority: Critical > Fix For: 0.14.0 > > Attachments: HIVE-8615.1.patch, HIVE-8615.2.patch > > > Changes in HIVE-7390 break backward compatibility for beeline csv and tsv > formats. > This can cause problems for users upgrading to hive 0.14, if they have code > for parsing the old output format. Instead of removing the old format in this > release, we should consider it deprecated and support it in a few releases > before removing it completely. > Incompatible Changes in the tsv and csv formats- > - Old format quoted every field. New format quotes only fields that contain a > delimiter or the quoting char. > - Old format quoted using single quotes, new format quotes using double quotes > - Old format didn't escape quotes in a field (a bug). New format does escape > the quotes -- This message was sent by Atlassian JIRA (v6.3.4#6332)
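The incompatible quoting changes listed in HIVE-8615 can be reproduced with Python's standard csv module as an analogy (a sketch only; beeline itself uses SuperCSV, and the old single-quote output is emulated by hand here). The new csv2-style output double-quotes only fields that need it and escapes embedded quotes by doubling them, per the standard CSV convention:

```python
import csv
import io

row = ['plain', 'has,comma', 'has "quote"']

# New csv2-style behaviour: double-quote character, quoting only fields
# that contain the delimiter or the quote char, embedded quotes doubled.
buf = io.StringIO()
csv.writer(buf, quotechar='"', quoting=csv.QUOTE_MINIMAL).writerow(row)
new_style = buf.getvalue().strip()
print(new_style)  # plain,"has,comma","has ""quote"""

# Old deprecated csv behaviour: every field wrapped in single quotes,
# embedded quotes left unescaped (the bug noted above).
old_style = ','.join("'%s'" % field for field in row)
print(old_style)  # 'plain','has,comma','has "quote"'
```

The old output is unambiguous to generate but ambiguous to parse once a field contains a quote character, which is why the new format escapes by doubling.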
[jira] [Commented] (HIVE-14632) beeline outputformat needs better documentation
[ https://issues.apache.org/jira/browse/HIVE-14632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15559375#comment-15559375 ] Lefty Leverenz commented on HIVE-14632: --- Good documentation, thanks [~kuczoram]! I made some minor edits. Very cool expandable examples -- I hadn't realized we can do that. One question: is the misalignment of the 'comment' column in the tsv example accurate? I assume it's due to the tab stops because the 'value' column has values longer than the column name, but just wanted to check. +1 but a technical review by [~michaelthoward] or [~thejas] would also be good. > beeline outputformat needs better documentation > --- > > Key: HIVE-14632 > URL: https://issues.apache.org/jira/browse/HIVE-14632 > Project: Hive > Issue Type: Improvement > Components: Beeline >Affects Versions: 0.14.0 > Environment: Hive HiveServer2 wiki >Reporter: Michael Howard >Assignee: Marta Kuczora > > SUMMARY > * need better wiki page doc for beeline outputformat option > * should explicitly say that "double quote characters" are used to enclose > fields which need enclosing. > * Should describe the treatment of embedded double quote chars as "doubled" > DETAIL > The page at: > https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-Separated-ValueOutputFormats > describes separated value outputformats csv/tsv/csv2/tsv2, etc. > I found doc to be inadequate and terminology to be confusing. > > These conform better to standard CSV convention, which adds quotes around a > > cell value > What kind of quotes? The only reference to quotes in this section refers to > single quotes for the deprecated csv/tsv format. > The JIRA at > https://issues.apache.org/jira/browse/HIVE-8615 > clarifies a bit: > - Old format quoted every field. New format quotes only fields that contain a > delimiter or the quoting char. 
> - Old format quoted using single quotes, new format quotes using double > quotes > - Old format didn't escape quotes in a field (a bug). New format does escape > the quotes > However, neither this JIRA page nor the wiki page doc define what is meant by > "escaping the quotes". > Q: In this context, does escaping mean "backslash escaping" or "double > embedded double quotes" or something else? > Investigation of source code reveals that this is using SuperCSV. > SuperCSV does not support backslash-escape of embedded quotes. See last line > of: > https://super-csv.github.io/super-csv/csv_specification.html > THE END -- This message was sent by Atlassian JIRA (v6.3.4#6332)