[jira] [Commented] (HIVE-12362) Hive's Parquet SerDe ignores 'serialization.null.format' property

2016-08-31 Thread Lenni Kuff (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15453014#comment-15453014
 ] 

Lenni Kuff commented on HIVE-12362:
---

I don't have a test case available to confirm this; the concern comes only from 
reading the code. It seems that extra work happens for each column value in 
each row, which could have a performance impact. 

> Hive's Parquet SerDe ignores 'serialization.null.format' property
> -
>
> Key: HIVE-12362
> URL: https://issues.apache.org/jira/browse/HIVE-12362
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.1.0
> Reporter: Naveen Gangam
> Assignee: Naveen Gangam
> Attachments: HIVE-12362.2.patch, HIVE-12362.patch
>
>
> {code}
> create table src (a string);
> insert into table src values (NULL), (''), ('');
> 0: jdbc:hive2://localhost:1/default> select * from src;
> +--------+--+
> | src.a  |
> +--------+--+
> | NULL   |
> |        |
> |        |
> +--------+--+
> create table dest (a string) row format serde 
> 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' stored as 
> INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' 
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
> alter table dest set SERDEPROPERTIES ('serialization.null.format' = '');
> alter table dest set TBLPROPERTIES ('serialization.null.format' = '');
> insert overwrite table dest select * from src;
> 0: jdbc:hive2://localhost:1/default> select * from test11;
> +-----------+--+
> | test11.a  |
> +-----------+--+
> | NULL      |
> |           |
> |           |
> +-----------+--+
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12362) Hive's Parquet SerDe ignores 'serialization.null.format' property

2016-08-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15452968#comment-15452968
 ] 

Hive QA commented on HIVE-12362:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12771364/HIVE-12362.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/1058/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/1058/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-1058/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.8.0_25 ]]
+ export JAVA_HOME=/usr/java/jdk1.8.0_25
+ JAVA_HOME=/usr/java/jdk1.8.0_25
+ export 
PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-MASTER-Build-1058/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 1f6949f HIVE-14233 - Improve vectorization for ACID by 
eliminating row-by-row stitching (Saket Saurabh via Eugene Koifman)
+ git clean -f -d
Removing common/src/java/org/apache/hadoop/hive/conf/HiveConf.java.orig
Removing common/src/test/org/apache/hadoop/hive/common/TestLogUtils.java
Removing ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java.orig
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 1f6949f HIVE-14233 - Improve vectorization for ACID by 
eliminating row-by-row stitching (Saket Saurabh via Eugene Koifman)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12771364 - PreCommit-HIVE-MASTER-Build
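
The log above ends with the patch failing to apply at strip levels p0, p1, and p2 against current master. A minimal sketch of how a similar check could be run locally before re-uploading a patch, assuming a git checkout and using `git apply --check`. The function name `find_strip_level` is hypothetical, and this is not the actual `smart-apply-patch.sh` used by the build; it only mirrors the p0/p1/p2 probing reported in the log.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not the real smart-apply-patch.sh): print the first
# strip level (0, 1, or 2) at which the given patch applies cleanly to the
# current working tree, or fail if none of them works.
find_strip_level() {
  local patch_file="$1"
  local level
  for level in 0 1 2; do
    # --check only validates the patch; it does not modify any files.
    if git apply --check "-p${level}" "$patch_file" 2>/dev/null; then
      echo "$level"
      return 0
    fi
  done
  echo "The patch does not appear to apply with p0, p1, or p2" >&2
  return 1
}
```

Running `find_strip_level HIVE-12362.2.patch` from the root of an up-to-date checkout would show whether the patch needs to be rebased before the pre-commit build can use it.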



[jira] [Commented] (HIVE-12362) Hive's Parquet SerDe ignores 'serialization.null.format' property

2016-08-31 Thread Lenni Kuff (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15452648#comment-15452648
 ] 

Lenni Kuff commented on HIVE-12362:
---

[~ngangam] - Looking at the patch it appears there may be some significant 
performance impact with this change. Have you done any performance testing with 
this patch? 



[jira] [Commented] (HIVE-12362) Hive's Parquet SerDe ignores 'serialization.null.format' property

2015-11-09 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14996531#comment-14996531
 ] 

Naveen Gangam commented on HIVE-12362:
--

The test failures are unrelated to the attached patch.

> Hive's Parquet SerDe ignores 'serialization.null.format' property
> -
>
> Key: HIVE-12362
> URL: https://issues.apache.org/jira/browse/HIVE-12362
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.1.0
> Reporter: Naveen Gangam
> Assignee: Naveen Gangam
> Attachments: HIVE-12362.patch


[jira] [Commented] (HIVE-12362) Hive's Parquet SerDe ignores 'serialization.null.format' property

2015-11-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997027#comment-14997027
 ] 

Hive QA commented on HIVE-12362:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12771364/HIVE-12362.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9777 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_null_format
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5972/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5972/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5972/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12771364 - PreCommit-HIVE-TRUNK-Build



[jira] [Commented] (HIVE-12362) Hive's Parquet SerDe ignores 'serialization.null.format' property

2015-11-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14995053#comment-14995053
 ] 

Hive QA commented on HIVE-12362:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12771125/HIVE-12362.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9762 tests executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-auto_sortmerge_join_13.q-tez_self_join.q-orc_vectorization_ppd.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5958/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5958/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5958/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12771125 - PreCommit-HIVE-TRUNK-Build

> Hive's Parquet SerDe ignores 'serialization.null.format' property
> -
>
> Key: HIVE-12362
> URL: https://issues.apache.org/jira/browse/HIVE-12362
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.1.0
> Reporter: Naveen Gangam
> Assignee: Naveen Gangam
> Attachments: HIVE-12362.patch