[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others

2017-07-24 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16099476#comment-16099476
 ] 

Lefty Leverenz commented on HIVE-16222:
---

Doc note:  This adds *hive.vectorized.row.serde.inputformat.excludes* to 
HiveConf.java and changes the default value of 
*hive.vectorized.use.row.serde.deserialize* to true, so the wiki needs to be 
updated.

* [Configuration Properties -- Vectorization | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Vectorization]
** [hive.vectorized.use.row.serde.deserialize | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.vectorized.use.row.serde.deserialize]
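
As a rough illustration of how the two properties interact, the sketch below treats the configuration as a plain dictionary and decides whether row-serde deserialization applies to a given input format. The function names, the dictionary-based configuration, and the comma-separated format of the excludes value are assumptions made for the example; this is not Hive's implementation, and the excludes value shown is only a sample setting.

```python
# Illustrative sketch only; not Hive's code.
USE_ROW_DESERIALIZE = "hive.vectorized.use.row.serde.deserialize"
ROW_SERDE_EXCLUDES = "hive.vectorized.row.serde.inputformat.excludes"


def parse_excludes(value):
    """Split a comma-separated list of input format class names (assumed format)."""
    return {name.strip() for name in value.split(",") if name.strip()}


def use_row_serde_deserialize(conf, input_format):
    """Return True when row-serde deserialization may be used for input_format."""
    # The default of "true" follows the doc note above.
    enabled = conf.get(USE_ROW_DESERIALIZE, "true").lower() == "true"
    excluded = parse_excludes(conf.get(ROW_SERDE_EXCLUDES, ""))
    return enabled and input_format not in excluded


conf = {
    USE_ROW_DESERIALIZE: "true",
    # Sample value chosen for the example, not a documented default.
    ROW_SERDE_EXCLUDES: "org.apache.parquet.hadoop.ParquetInputFormat",
}
print(use_row_serde_deserialize(conf, "org.apache.parquet.hadoop.ParquetInputFormat"))
print(use_row_serde_deserialize(conf, "org.apache.hadoop.mapred.TextInputFormat"))
```

With this sample configuration the Parquet input format is skipped while the text input format still goes through the row-serde path.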

Added a TODOC3.0 label.

(Welcome back, Sergey.)

> add a setting to disable row.serde for specific formats; enable for others
> --
>
> Key: HIVE-16222
> URL: https://issues.apache.org/jira/browse/HIVE-16222
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-16222.01.patch, HIVE-16222.02.patch, 
> HIVE-16222.03.patch, HIVE-16222.04.patch, HIVE-16222.05.patch, 
> HIVE-16222.patch
>
>
> Per [~gopalv]
> {quote}
> row.serde = true ... breaks Parquet (they expect to get the same object back, 
> which means you can't buffer 1024 rows).
> {quote}
> We want to enable this and vector.serde for text vectorization. Need to turn 
> it off for specific formats.
>  
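
The buffering problem in the quote can be illustrated with a toy reader. The sketch below is one reading of the quote and is not Parquet's or Hive's reader: `ReusingReader` and `buffer_rows` are hypothetical names, and the only assumption carried over from the description is that the reader hands back the same mutable object on every call. Buffering references to that object leaves every slot pointing at the last row, so a batch (1024 rows in the quote, 4 here) cannot be collected without copying.

```python
class ReusingReader:
    """Toy reader that refills and returns one shared row object (hypothetical)."""

    def __init__(self, values):
        self._values = iter(values)
        self._row = [None]  # single mutable row, reused for every call

    def next_row(self):
        value = next(self._values, None)
        if value is None:
            return None
        self._row[0] = value
        return self._row


def buffer_rows(reader, batch_size, copy):
    """Collect up to batch_size rows, optionally copying each one."""
    batch = []
    while len(batch) < batch_size:
        row = reader.next_row()
        if row is None:
            break
        batch.append(list(row) if copy else row)
    return batch


aliased = buffer_rows(ReusingReader([1, 2, 3, 4]), 4, copy=False)
copied = buffer_rows(ReusingReader([1, 2, 3, 4]), 4, copy=True)
print(aliased)  # every entry is the same object, holding the last value read
print(copied)   # independent rows
```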



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others

2017-07-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16099261#comment-16099261
 ] 

Sergey Shelukhin commented on HIVE-16222:
-

Looks like I forgot to push this


[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others

2017-07-20 Thread Daniel Voros (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16094705#comment-16094705
 ] 

Daniel Voros commented on HIVE-16222:
-

[~sershe], [~leftylev] is right, this hasn't been committed yet.


[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others

2017-07-03 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072092#comment-16072092
 ] 

Lefty Leverenz commented on HIVE-16222:
---

[~sershe], I don't see this commit in email or github.


[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others

2017-06-30 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070548#comment-16070548
 ] 

Matt McCline commented on HIVE-16222:
-

+1 LGTM


[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others

2017-06-30 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070422#comment-16070422
 ] 

Sergey Shelukhin commented on HIVE-16222:
-

[~mmccline] ping?


[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others

2017-06-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16067352#comment-16067352
 ] 

Hive QA commented on HIVE-16222:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12874943/HIVE-16222.05.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10851 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1] (batchId=238)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main] (batchId=150)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=233)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapFunctionReplication (batchId=217)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionIncrementalReplication (batchId=217)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionWithFunctionBinaryJarsOnHDFS (batchId=217)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=178)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=178)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=178)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5817/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5817/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5817/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12874943 - PreCommit-HIVE-Build


[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others

2017-06-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16065939#comment-16065939
 ] 

Hive QA commented on HIVE-16222:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12874748/HIVE-16222.04.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 10851 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1] (batchId=238)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_round] (batchId=34)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_round] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_bucketmapjoin1] (batchId=149)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query16] (batchId=233)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query94] (batchId=233)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapFunctionReplication (batchId=217)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionIncrementalReplication (batchId=217)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionWithFunctionBinaryJarsOnHDFS (batchId=217)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=178)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=178)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=178)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5798/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5798/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5798/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12874748 - PreCommit-HIVE-Build


[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others

2017-06-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16065872#comment-16065872
 ] 

Hive QA commented on HIVE-16222:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12874748/HIVE-16222.04.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 10851 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_round] (batchId=34)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_round] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_bucketmapjoin1] (batchId=149)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query16] (batchId=233)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=233)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query94] (batchId=233)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapFunctionReplication (batchId=217)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionIncrementalReplication (batchId=217)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionWithFunctionBinaryJarsOnHDFS (batchId=217)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=178)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=178)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=178)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5796/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5796/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5796/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12874748 - PreCommit-HIVE-Build


[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others

2017-06-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16065439#comment-16065439
 ] 

Sergey Shelukhin commented on HIVE-16222:
-

Actually, is this patch even relevant? While adding the test I noticed that 
Parquet is now natively vectorized. Was there a specific problem or failure 
when row.serde was enabled for all formats?


[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others

2017-03-24 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15941324#comment-15941324
 ] 

Matt McCline commented on HIVE-16222:
-

Or:

{code}
 enabled: false
 enabledConditionsNotMet: hive.vectorized.use.row.serde.deserialize IS true AND org.apache.parquet.hadoop.ParquetInputFormat NOT IN hive.vectorized.row.serde.inputformat.excludes [org.apache.parquet.hadoop.ParquetInputFormat] IS false
 groupByVectorOutput: true
 inputFileFormats: org.apache.parquet.hadoop.ParquetInputFormat
{code}
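
The condition text proposed above could be assembled along these lines. This is a sketch of the wording only: `format_row_serde_condition` is a hypothetical helper, it assumes the row-serde flag is already true, and nothing here is taken from the patch or from Hive's EXPLAIN implementation.

```python
USE_ROW_DESERIALIZE = "hive.vectorized.use.row.serde.deserialize"
ROW_SERDE_EXCLUDES = "hive.vectorized.row.serde.inputformat.excludes"


def format_row_serde_condition(input_format, excludes):
    """Render the 'NOT IN' condition and its outcome in the proposed wording."""
    # Assumes the row-serde flag itself is true; only the excludes check varies.
    outcome = "true" if input_format not in excludes else "false"
    return (
        f"{USE_ROW_DESERIALIZE} IS true AND {input_format} NOT IN "
        f"{ROW_SERDE_EXCLUDES} [{', '.join(excludes)}] IS {outcome}"
    )


parquet = "org.apache.parquet.hadoop.ParquetInputFormat"
print(format_row_serde_condition(parquet, [parquet]))
```

Putting the input format before the property name, as in this variant, keeps the sentence readable when the excludes list holds several class names.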



[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others

2017-03-24 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15941033#comment-15941033
 ] 

Matt McCline commented on HIVE-16222:
-

[~sershe] I'd like to see some Q file cases that attempt to use the 
org.apache.parquet.hadoop.ParquetInputFormat input file format and show it not 
being allowed.

I'd also like the not-met conditions to include detail showing why row-mode 
was not used, for example:
{code}
 enabled: false
 enabledConditionsNotMet: hive.vectorized.use.row.serde.deserialize IS true AND hive.vectorized.row.serde.inputformat.excludes [org.apache.parquet.hadoop.ParquetInputFormat] NOT CONTAINS org.apache.parquet.hadoop.ParquetInputFormat IS false
 groupByVectorOutput: true
 inputFileFormats: org.apache.parquet.hadoop.ParquetInputFormat
{code}


[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others

2017-03-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939169#comment-15939169
 ] 

Hive QA commented on HIVE-16222:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12860208/HIVE-16222.03.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10507 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] (batchId=35)
org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver (batchId=233)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4320/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4320/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4320/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12860208 - PreCommit-HIVE-Build


[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others

2017-03-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15937700#comment-15937700
 ] 

Hive QA commented on HIVE-16222:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12860030/HIVE-16222.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10509 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] (batchId=35)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_round] (batchId=33)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_round] (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_bucketmapjoin1] (batchId=144)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4303/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4303/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4303/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12860030 - PreCommit-HIVE-Build


[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others

2017-03-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935626#comment-15935626
 ] 

Hive QA commented on HIVE-16222:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12859800/HIVE-16222.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 45 failed/errored test(s), 10496 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] (batchId=35)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mergejoin] (batchId=56)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_int_type_promotion] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[structin] (batchId=30)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[tez_join_hash] (batchId=49)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_bucket] (batchId=24)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_cast_constant] (batchId=8)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_char_2] (batchId=64)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_round] (batchId=33)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby4] (batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby6] (batchId=81)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_mapjoin] (batchId=70)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_reduce] (batchId=52)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_mapjoin_reduce] (batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_mr_diff_schema_alias] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_orderby_5] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_reduce_groupby_decimal] (batchId=30)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_string_concat] (batchId=30)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_tablesample_rows] (batchId=48)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_udf_character_length] (batchId=76)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_udf_octet_length] (batchId=2)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_13] (batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_14] (batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_15] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_limit] (batchId=34)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_date_funcs] (batchId=71)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_parquet_types] (batchId=62)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_shufflejoin] (batchId=68)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_opt_vectorization] (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization2] (batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_join_hash] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_vector_dynpart_hashjoin_2] (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_bucket] (batchId=144)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_round] (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_mapjoin] (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_mapjoin_reduce] (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_udf_character_length] (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_udf_octet_length] (batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_bucketmapjoin1] (batchId=144)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_dynamic_partition_pruning] (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_join46] (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_parquet_types] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_mapjoin_reduce] (batchId=130)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4277/testReport
Console output: https://builds.apache.org/jo

[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others

2017-03-15 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15927172#comment-15927172
 ] 

Matt McCline commented on HIVE-16222:
-

ROW_DESERIALIZE_INPUT_FORMAT_EXCLUDE_LIST might be a better name.


[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others

2017-03-15 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15927143#comment-15927143
 ] 

Gopal V commented on HIVE-16222:


ROW_DESERIALIZE_BLACKLIST could be a config.

That part will need to be undone in the patch fixing Parquet - +1 pending a 
clean run.


[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others

2017-03-15 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15926681#comment-15926681
 ] 

Sergey Shelukhin commented on HIVE-16222:
-

cc [~mmccline]
