[jira] [Commented] (HIVE-6670) ClassNotFound with Serde

2014-03-27 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13949554#comment-13949554
 ] 

Alan Gates commented on HIVE-6670:
--

Ran tests locally, all looks good.

 ClassNotFound with Serde
 

 Key: HIVE-6670
 URL: https://issues.apache.org/jira/browse/HIVE-6670
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Abin Shahab
Assignee: Abin Shahab
 Attachments: HIVE-6670-branch-0.12.patch, HIVE-6670.1.patch, 
 HIVE-6670.patch


 We are finding a ClassNotFound exception when we use 
 CSVSerde(https://github.com/ogrodnek/csv-serde) to create a table.
 This is happening because MapredLocalTask does not pass the local added jars 
 to ExecDriver when that is launched.
 ExecDriver's classpath does not include the added jars. Therefore, when the 
 plan is deserialized, it throws a ClassNotFoundException in the 
 deserialization code, and results in a TableDesc object with a Null 
 DeserializerClass.
 This results in an NPE during Fetch.
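The failure chain described above can be illustrated with a small, self-contained Java sketch. It is not Hive code: `TableDescSketch` and `deserialize` are hypothetical stand-ins for the TableDesc object and the plan deserialization step, and it assumes the CSVSerde jar is absent from the classpath, as it is for the child ExecDriver process.

```java
public class MissingSerdeDemo {
    // Hypothetical stand-in for the plan's table descriptor; not Hive's TableDesc.
    static final class TableDescSketch {
        Class<?> deserializerClass; // stays null if the class cannot be loaded
    }

    // Resolves the SerDe class by name, the way a plan deserializer might.
    static TableDescSketch deserialize(String serdeClassName) {
        TableDescSketch desc = new TableDescSketch();
        try {
            desc.deserializerClass = Class.forName(serdeClassName);
        } catch (ClassNotFoundException e) {
            // The error is reported but not rethrown, so the caller gets a
            // descriptor whose deserializer class is still null.
            System.err.println(e + "\nContinuing ...");
        }
        return desc;
    }

    public static void main(String[] args) {
        // Assumption: the csv-serde jar is not on this JVM's classpath.
        TableDescSketch desc = deserialize("com.bizo.hive.serde.csv.CSVSerde");
        try {
            desc.deserializerClass.getName(); // first use of the null field
        } catch (NullPointerException e) {
            System.out.println("NPE when the deserializer class is used");
        }
    }
}
```

The sketch only shows why a swallowed ClassNotFoundException surfaces later as an NPE; the real fix is to make the added jars visible to the child process.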
 Steps to reproduce:
 Use wget to download 
 https://drone.io/github.com/ogrodnek/csv-serde/files/target/csv-serde-1.1.2-0.11.0-all.jar
 to a local path, e.g. 
 /home/soam/HiveSerdeIssue/csv-serde-1.1.2-0.11.0-all.jar.
 Place some sample CSV files in HDFS as follows:
 hdfs dfs -mkdir /user/soam/HiveSerdeIssue/sampleCSV/
 hdfs dfs -put /home/soam/sampleCSV.csv /user/soam/HiveSerdeIssue/sampleCSV/
 hdfs dfs -mkdir /user/soam/HiveSerdeIssue/sampleJoinTarget/
 hdfs dfs -put /home/soam/sampleJoinTarget.csv /user/soam/HiveSerdeIssue/sampleJoinTarget/
 
 Create the tables in Hive:
 ADD JAR /home/soam/HiveSerdeIssue/csv-serde-1.1.2-0.11.0-all.jar;
 create external table sampleCSV (md5hash string, filepath string)
 row format serde 'com.bizo.hive.serde.csv.CSVSerde'
 stored as textfile
 location '/user/soam/HiveSerdeIssue/sampleCSV/'
 ;
 create external table sampleJoinTarget (md5hash string, filepath string, 
 datestamp string, nblines string, nberrors string)
 ROW FORMAT DELIMITED 
 FIELDS TERMINATED BY ',' 
 LINES TERMINATED BY '\n'
 STORED AS TEXTFILE
 LOCATION '/user/soam/HiveSerdeIssue/sampleJoinTarget/'
 ;
 ===
 Now, try the following JOIN:
 ADD JAR /home/soam/HiveSerdeIssue/csv-serde-1.1.2-0.11.0-all.jar;
 SELECT 
 sampleCSV.md5hash, 
 sampleCSV.filepath 
 FROM sampleCSV
 JOIN sampleJoinTarget
 ON (sampleCSV.md5hash = sampleJoinTarget.md5hash) 
 ;
 —
 This will fail with the error:
 Execution log at: /tmp/soam/.log
 java.lang.ClassNotFoundException: com/bizo/hive/serde/csv/CSVSerde
 Continuing ...
 2014-03-11 10:35:03 Starting to launch local task to process map join; maximum memory = 238551040
 Execution failed with exit status: 2
 Obtaining error information
 Task failed!
 Task ID:
 Stage-4
 Logs:
 /var/log/hive/soam/hive.log
 FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
 Try the following LEFT JOIN. This will work:
 SELECT 
 sampleCSV.md5hash, 
 sampleCSV.filepath 
 FROM sampleCSV
 LEFT JOIN sampleJoinTarget
 ON (sampleCSV.md5hash = sampleJoinTarget.md5hash) 
 ;
 ==



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6670) ClassNotFound with Serde

2014-03-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13949968#comment-13949968
 ] 

Hive QA commented on HIVE-6670:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12637036/HIVE-6670.1.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5492 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_mapreduce_stack_trace_hadoop20
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1987/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1987/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12637036



[jira] [Commented] (HIVE-6670) ClassNotFound with Serde

2014-03-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13947964#comment-13947964
 ] 

Hive QA commented on HIVE-6670:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12636760/HIVE-6670.patch

{color:green}SUCCESS:{color} +1 5457 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1964/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1964/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12636760



[jira] [Commented] (HIVE-6670) ClassNotFound with Serde

2014-03-26 Thread Abin Shahab (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948620#comment-13948620
 ] 

Abin Shahab commented on HIVE-6670:
---

Thanks for rolling it forward!





[jira] [Commented] (HIVE-6670) ClassNotFound with Serde

2014-03-26 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948677#comment-13948677
 ] 

Jason Dere commented on HIVE-6670:
--

+1



[jira] [Commented] (HIVE-6670) ClassNotFound with Serde

2014-03-25 Thread Abin Shahab (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13947077#comment-13947077
 ] 

Abin Shahab commented on HIVE-6670:
---

[~hashutosh] I can write a test case. Is there a similar testcase that I can 
look at?
I'm not sure how to create a ReviewBoard entry. It'd be great if you can do 
that once I upload the test.




[jira] [Commented] (HIVE-6670) ClassNotFound with Serde

2014-03-25 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13947287#comment-13947287
 ] 

Ashutosh Chauhan commented on HIVE-6670:


The query you posted in the description is a good test case. Just add it as a 
.q file in ql/src/test/queries/clientpositive/, where all the other test 
queries are. More info at [wiki site | 
https://cwiki.apache.org/confluence/display/Hive/HiveDeveloperFAQ#HiveDeveloperFAQ].
You can create a review request on [review board | 
https://reviews.apache.org/r/new/].



[jira] [Commented] (HIVE-6670) ClassNotFound with Serde

2014-03-25 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13947359#comment-13947359
 ] 

Ashutosh Chauhan commented on HIVE-6670:


I tested manually and was able to reproduce the failure. With the patch, the 
query succeeds, which is good. 
However, instead of passing the jars on the command line, I think it is better 
to pass them via the Conf object using the {{hive.added.jars.path}} variable, 
the way it's done in MapRedTask. That way it's consistent across the two types 
of task.



[jira] [Commented] (HIVE-6670) ClassNotFound with Serde

2014-03-25 Thread Abin Shahab (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13947420#comment-13947420
 ] 

Abin Shahab commented on HIVE-6670:
---

But I don't want to overwrite existing added jars.





[jira] [Commented] (HIVE-6670) ClassNotFound with Serde

2014-03-25 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13947432#comment-13947432
 ] 

Ashutosh Chauhan commented on HIVE-6670:


You don't need to. You can append.
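
A minimal sketch of appending rather than overwriting, under two assumptions: the value of hive.added.jars.path is treated as a comma-separated list, and java.util.Properties stands in for the Hive Conf object. The class and method names (`AddedJarsSketch`, `appendJar`, `addJar`) are hypothetical and this is not the code from the patch.

```java
import java.util.Properties;

public class AddedJarsSketch {
    static final String KEY = "hive.added.jars.path";

    // Returns the existing list with the new jar appended; nothing is dropped.
    static String appendJar(String existing, String jar) {
        if (existing == null || existing.trim().isEmpty()) {
            return jar;
        }
        return existing + "," + jar;
    }

    // Updates the property in place, preserving jars that were already added.
    static void addJar(Properties conf, String jar) {
        conf.setProperty(KEY, appendJar(conf.getProperty(KEY), jar));
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(KEY, "/tmp/existing.jar"); // hypothetical earlier ADD JAR
        addJar(conf, "/home/soam/HiveSerdeIssue/csv-serde-1.1.2-0.11.0-all.jar");
        System.out.println(conf.getProperty(KEY));
    }
}
```

Reading the current value before writing is what keeps previously added jars intact.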
