[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others
[ https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16099476#comment-16099476 ] Lefty Leverenz commented on HIVE-16222: --- Doc note: This adds *hive.vectorized.row.serde.inputformat.excludes* to HiveConf.java and changes the default value of *hive.vectorized.use.row.serde.deserialize* to true, so the wiki needs to be updated. * [Configuration Properties -- Vectorization | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Vectorization] ** [hive.vectorized.use.row.serde.deserialize | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.vectorized.use.row.serde.deserialize] Added a TODOC3.0 label. (Welcome back, Sergey.) > add a setting to disable row.serde for specific formats; enable for others > -- > > Key: HIVE-16222 > URL: https://issues.apache.org/jira/browse/HIVE-16222 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Labels: TODOC3.0 > Fix For: 3.0.0 > > Attachments: HIVE-16222.01.patch, HIVE-16222.02.patch, > HIVE-16222.03.patch, HIVE-16222.04.patch, HIVE-16222.05.patch, > HIVE-16222.patch > > > Per [~gopalv] > {quote} > row.serde = true ... breaks Parquet (they expect to get the same object back, > which means you can't buffer 1024 rows). > {quote} > We want to enable this and vector.serde for text vectorization. Need to turn > it off for specific formats. > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others
[ https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16099261#comment-16099261 ] Sergey Shelukhin commented on HIVE-16222: - Looks like I forgot to push this > add a setting to disable row.serde for specific formats; enable for others > -- > > Key: HIVE-16222 > URL: https://issues.apache.org/jira/browse/HIVE-16222 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 3.0.0 > > Attachments: HIVE-16222.01.patch, HIVE-16222.02.patch, > HIVE-16222.03.patch, HIVE-16222.04.patch, HIVE-16222.05.patch, > HIVE-16222.patch > > > Per [~gopalv] > {quote} > row.serde = true ... breaks Parquet (they expect to get the same object back, > which means you can't buffer 1024 rows). > {quote} > We want to enable this and vector.serde for text vectorization. Need to turn > it off for specific formats. > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others
[ https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16094705#comment-16094705 ] Daniel Voros commented on HIVE-16222: - [~sershe], [~leftylev] is right, this hasn't been committed yet. > add a setting to disable row.serde for specific formats; enable for others > -- > > Key: HIVE-16222 > URL: https://issues.apache.org/jira/browse/HIVE-16222 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 3.0.0 > > Attachments: HIVE-16222.01.patch, HIVE-16222.02.patch, > HIVE-16222.03.patch, HIVE-16222.04.patch, HIVE-16222.05.patch, > HIVE-16222.patch > > > Per [~gopalv] > {quote} > row.serde = true ... breaks Parquet (they expect to get the same object back, > which means you can't buffer 1024 rows). > {quote} > We want to enable this and vector.serde for text vectorization. Need to turn > it off for specific formats. > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others
[ https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072092#comment-16072092 ] Lefty Leverenz commented on HIVE-16222: --- [~sershe], I don't see this commit in email or github. > add a setting to disable row.serde for specific formats; enable for others > -- > > Key: HIVE-16222 > URL: https://issues.apache.org/jira/browse/HIVE-16222 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 3.0.0 > > Attachments: HIVE-16222.01.patch, HIVE-16222.02.patch, > HIVE-16222.03.patch, HIVE-16222.04.patch, HIVE-16222.05.patch, > HIVE-16222.patch > > > Per [~gopalv] > {quote} > row.serde = true ... breaks Parquet (they expect to get the same object back, > which means you can't buffer 1024 rows). > {quote} > We want to enable this and vector.serde for text vectorization. Need to turn > it off for specific formats. > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others
[ https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070548#comment-16070548 ] Matt McCline commented on HIVE-16222: - +1 LGTM > add a setting to disable row.serde for specific formats; enable for others > -- > > Key: HIVE-16222 > URL: https://issues.apache.org/jira/browse/HIVE-16222 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16222.01.patch, HIVE-16222.02.patch, > HIVE-16222.03.patch, HIVE-16222.04.patch, HIVE-16222.05.patch, > HIVE-16222.patch > > > Per [~gopalv] > {quote} > row.serde = true ... breaks Parquet (they expect to get the same object back, > which means you can't buffer 1024 rows). > {quote} > We want to enable this and vector.serde for text vectorization. Need to turn > it off for specific formats. > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others
[ https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070422#comment-16070422 ] Sergey Shelukhin commented on HIVE-16222: - [~mmccline] ping? > add a setting to disable row.serde for specific formats; enable for others > -- > > Key: HIVE-16222 > URL: https://issues.apache.org/jira/browse/HIVE-16222 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16222.01.patch, HIVE-16222.02.patch, > HIVE-16222.03.patch, HIVE-16222.04.patch, HIVE-16222.05.patch, > HIVE-16222.patch > > > Per [~gopalv] > {quote} > row.serde = true ... breaks Parquet (they expect to get the same object back, > which means you can't buffer 1024 rows). > {quote} > We want to enable this and vector.serde for text vectorization. Need to turn > it off for specific formats. > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others
[ https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16067352#comment-16067352 ] Hive QA commented on HIVE-16222: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12874943/HIVE-16222.05.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10851 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1] (batchId=238) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main] (batchId=150) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=233) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapFunctionReplication (batchId=217) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionIncrementalReplication (batchId=217) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionWithFunctionBinaryJarsOnHDFS (batchId=217) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=178) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=178) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=178) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5817/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5817/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5817/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 9 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12874943 - PreCommit-HIVE-Build > add a setting to disable row.serde for specific formats; enable for others > -- > > Key: HIVE-16222 > URL: https://issues.apache.org/jira/browse/HIVE-16222 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16222.01.patch, HIVE-16222.02.patch, > HIVE-16222.03.patch, HIVE-16222.04.patch, HIVE-16222.05.patch, > HIVE-16222.patch > > > Per [~gopalv] > {quote} > row.serde = true ... breaks Parquet (they expect to get the same object back, > which means you can't buffer 1024 rows). > {quote} > We want to enable this and vector.serde for text vectorization. Need to turn > it off for specific formats. > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others
[ https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16065939#comment-16065939 ] Hive QA commented on HIVE-16222: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12874748/HIVE-16222.04.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 10851 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1] (batchId=238) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_round] (batchId=34) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_round] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_bucketmapjoin1] (batchId=149) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=99) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query16] (batchId=233) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query94] (batchId=233) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapFunctionReplication (batchId=217) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionIncrementalReplication (batchId=217) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionWithFunctionBinaryJarsOnHDFS (batchId=217) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=178) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=178) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=178) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5798/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5798/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5798/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 14 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12874748 - PreCommit-HIVE-Build > add a setting to disable row.serde for specific formats; enable for others > -- > > Key: HIVE-16222 > URL: https://issues.apache.org/jira/browse/HIVE-16222 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16222.01.patch, HIVE-16222.02.patch, > HIVE-16222.03.patch, HIVE-16222.04.patch, HIVE-16222.patch > > > Per [~gopalv] > {quote} > row.serde = true ... breaks Parquet (they expect to get the same object back, > which means you can't buffer 1024 rows). > {quote} > We want to enable this and vector.serde for text vectorization. Need to turn > it off for specific formats. > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others
[ https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16065872#comment-16065872 ] Hive QA commented on HIVE-16222: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12874748/HIVE-16222.04.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 10851 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_round] (batchId=34) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_round] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_bucketmapjoin1] (batchId=149) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query16] (batchId=233) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=233) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query94] (batchId=233) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapFunctionReplication (batchId=217) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionIncrementalReplication (batchId=217) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionWithFunctionBinaryJarsOnHDFS (batchId=217) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=178) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=178) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=178) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5796/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5796/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5796/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 13 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12874748 - PreCommit-HIVE-Build > add a setting to disable row.serde for specific formats; enable for others > -- > > Key: HIVE-16222 > URL: https://issues.apache.org/jira/browse/HIVE-16222 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16222.01.patch, HIVE-16222.02.patch, > HIVE-16222.03.patch, HIVE-16222.04.patch, HIVE-16222.patch > > > Per [~gopalv] > {quote} > row.serde = true ... breaks Parquet (they expect to get the same object back, > which means you can't buffer 1024 rows). > {quote} > We want to enable this and vector.serde for text vectorization. Need to turn > it off for specific formats. > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others
[ https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16065439#comment-16065439 ] Sergey Shelukhin commented on HIVE-16222: - Actually is this patch even relevant? Upon adding the test I see that Parquet is now natively vectorized. Was there a specific problem/failure when enabling row.serde for all formats? > add a setting to disable row.serde for specific formats; enable for others > -- > > Key: HIVE-16222 > URL: https://issues.apache.org/jira/browse/HIVE-16222 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16222.01.patch, HIVE-16222.02.patch, > HIVE-16222.03.patch, HIVE-16222.patch > > > Per [~gopalv] > {quote} > row.serde = true ... breaks Parquet (they expect to get the same object back, > which means you can't buffer 1024 rows). > {quote} > We want to enable this and vector.serde for text vectorization. Need to turn > it off for specific formats. > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others
[ https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15941324#comment-15941324 ] Matt McCline commented on HIVE-16222: - Or: {code} enabled: false enabledConditionsNotMet: hive.vectorized.use.row.serde.deserialize IS true AND org.apache.parquet.hadoop.ParquetInputFormat NOT IN hive.vectorized.row.serde.inputformat.excludes [org.apache.parquet.hadoop.ParquetInputFormat] IS false groupByVectorOutput: true inputFileFormats: org.apache.parquet.hadoop.ParquetInputFormat {code} > add a setting to disable row.serde for specific formats; enable for others > -- > > Key: HIVE-16222 > URL: https://issues.apache.org/jira/browse/HIVE-16222 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16222.01.patch, HIVE-16222.02.patch, > HIVE-16222.03.patch, HIVE-16222.patch > > > Per [~gopalv] > {quote} > row.serde = true ... breaks Parquet (they expect to get the same object back, > which means you can't buffer 1024 rows). > {quote} > We want to enable this and vector.serde for text vectorization. Need to turn > it off for specific formats. > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others
[ https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15941033#comment-15941033 ] Matt McCline commented on HIVE-16222: - [~sershe] I'd like to see some Q file cases showing an attempt to use the org.apache.parquet.hadoop.ParquetInputFormat input file format and it not being allowed. And detail in the not met conditions showing why row-mode was not used... {code} enabled: false enabledConditionsNotMet: hive.vectorized.use.row.serde.deserialize IS true AND hive.vectorized.row.serde.inputformat.excludes [org.apache.parquet.hadoop.ParquetInputFormat] NOT CONTAINS org.apache.parquet.hadoop.ParquetInputFormat IS false groupByVectorOutput: true inputFileFormats: org.apache.parquet.hadoop.ParquetInputFormat {code} > add a setting to disable row.serde for specific formats; enable for others > -- > > Key: HIVE-16222 > URL: https://issues.apache.org/jira/browse/HIVE-16222 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16222.01.patch, HIVE-16222.02.patch, > HIVE-16222.03.patch, HIVE-16222.patch > > > Per [~gopalv] > {quote} > row.serde = true ... breaks Parquet (they expect to get the same object back, > which means you can't buffer 1024 rows). > {quote} > We want to enable this and vector.serde for text vectorization. Need to turn > it off for specific formats. > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others
[ https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939169#comment-15939169 ] Hive QA commented on HIVE-16222: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12860208/HIVE-16222.03.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10507 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] (batchId=35) org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver (batchId=233) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4320/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4320/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4320/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12860208 - PreCommit-HIVE-Build > add a setting to disable row.serde for specific formats; enable for others > -- > > Key: HIVE-16222 > URL: https://issues.apache.org/jira/browse/HIVE-16222 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16222.01.patch, HIVE-16222.02.patch, > HIVE-16222.03.patch, HIVE-16222.patch > > > Per [~gopalv] > {quote} > row.serde = true ... breaks Parquet (they expect to get the same object back, > which means you can't buffer 1024 rows). > {quote} > We want to enable this and vector.serde for text vectorization. Need to turn > it off for specific formats. > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others
[ https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15937700#comment-15937700 ] Hive QA commented on HIVE-16222: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12860030/HIVE-16222.02.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10509 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] (batchId=35) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_round] (batchId=33) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_round] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_bucketmapjoin1] (batchId=144) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4303/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4303/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4303/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12860030 - PreCommit-HIVE-Build > add a setting to disable row.serde for specific formats; enable for others > -- > > Key: HIVE-16222 > URL: https://issues.apache.org/jira/browse/HIVE-16222 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16222.01.patch, HIVE-16222.02.patch, > HIVE-16222.patch > > > Per [~gopalv] > {quote} > row.serde = true ... breaks Parquet (they expect to get the same object back, > which means you can't buffer 1024 rows). > {quote} > We want to enable this and vector.serde for text vectorization. Need to turn > it off for specific formats. > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others
[ https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935626#comment-15935626 ] Hive QA commented on HIVE-16222: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12859800/HIVE-16222.01.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 45 failed/errored test(s), 10496 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] (batchId=35) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mergejoin] (batchId=56) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_int_type_promotion] (batchId=39) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[structin] (batchId=30) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[tez_join_hash] (batchId=49) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_bucket] (batchId=24) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_cast_constant] (batchId=8) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_char_2] (batchId=64) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_round] (batchId=33) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby4] (batchId=14) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby6] (batchId=81) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_mapjoin] (batchId=70) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_reduce] (batchId=52) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_mapjoin_reduce] (batchId=73) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_mr_diff_schema_alias] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_orderby_5] (batchId=39) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_reduce_groupby_decimal] (batchId=30) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_string_concat] (batchId=30) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_tablesample_rows] (batchId=48) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_udf_character_length] (batchId=76) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_udf_octet_length] (batchId=2) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_13] (batchId=47) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_14] (batchId=14) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_15] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_limit] (batchId=34) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_date_funcs] (batchId=71) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_parquet_types] (batchId=62) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_shufflejoin] (batchId=68) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_opt_vectorization] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization2] (batchId=140) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] (batchId=151) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_join_hash] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_vector_dynpart_hashjoin_2] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_bucket] (batchId=144) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_round] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_mapjoin] (batchId=154) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_mapjoin_reduce] (batchId=155) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_udf_character_length] (batchId=155) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_udf_octet_length] (batchId=140) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_bucketmapjoin1] (batchId=144) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_dynamic_partition_pruning] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_join46] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_parquet_types] (batchId=152) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=94) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_mapjoin_reduce] (batchId=130) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4277/testReport Console output: https://builds.apache.org/jo
[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others
[ https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15927172#comment-15927172 ] Matt McCline commented on HIVE-16222: - ROW_DESERIALIZE_INPUT_FORMAT_EXCLUDE_LIST might be a better name. > add a setting to disable row.serde for specific formats; enable for others > -- > > Key: HIVE-16222 > URL: https://issues.apache.org/jira/browse/HIVE-16222 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16222.patch > > > Per [~gopalv] > {quote} > row.serde = true ... breaks Parquet (they expect to get the same object back, > which means you can't buffer 1024 rows). > {quote} > We want to enable this and vector.serde for text vectorization. Need to turn > it off for specific formats. > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others
[ https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15927143#comment-15927143 ] Gopal V commented on HIVE-16222: ROW_DESERIALIZE_BLACKLIST could be a config. That part will need to be undone in the patch fixing Parquet - +1 pending a clean run. > add a setting to disable row.serde for specific formats; enable for others > -- > > Key: HIVE-16222 > URL: https://issues.apache.org/jira/browse/HIVE-16222 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16222.patch > > > Per [~gopalv] > {quote} > row.serde = true ... breaks Parquet (they expect to get the same object back, > which means you can't buffer 1024 rows). > {quote} > We want to enable this and vector.serde for text vectorization. Need to turn > it off for specific formats. > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others
[ https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15926681#comment-15926681 ] Sergey Shelukhin commented on HIVE-16222: - cc [~mmccline] > add a setting to disable row.serde for specific formats; enable for others > -- > > Key: HIVE-16222 > URL: https://issues.apache.org/jira/browse/HIVE-16222 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > > Per [~gopalv] > {quote} > row.serde = true ... breaks Parquet (they expect to get the same object back, > which means you can't buffer 1024 rows). > {quote} > We want to enable this and vector.serde for text vectorization. Need to turn > it off for specific formats. > -- This message was sent by Atlassian JIRA (v6.3.15#6346)