Re: Spark Hive max key length is 767 bytes

2014-09-25 Thread Denny Lee
Sorry for missing your original email - thanks for the catch, eh?!

On Thu, Sep 25, 2014 at 7:14 AM, arthur.hk.c...@gmail.com 
arthur.hk.c...@gmail.com wrote:

 Hi,

 Fixed the issue by downgrading Hive from 0.13.1 to 0.12.0; it works well now.

 Regards


 On 31 Aug, 2014, at 7:28 am, arthur.hk.c...@gmail.com 
 arthur.hk.c...@gmail.com wrote:

 Hi,

 Already done but still get the same error:

 (I use Hive 0.13.1, Spark 1.0.2, and Hadoop 2.4.1)

 Steps:
 Step 1) mysql:

 alter database hive character set latin1;

 Step 2) HIVE:

 hive> create table test_datatype2 (testbigint bigint);
 OK
 Time taken: 0.708 seconds

 hive> drop table test_datatype2;
 OK
 Time taken: 23.272 seconds

 Step 3) scala> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)

 14/08/29 19:33:52 INFO Configuration.deprecation:
 mapred.reduce.tasks.speculative.execution is deprecated. Instead, use
 mapreduce.reduce.speculative
 hiveContext: org.apache.spark.sql.hive.HiveContext =
 org.apache.spark.sql.hive.HiveContext@395c7b94

 scala> hiveContext.hql("create table test_datatype3 (testbigint bigint)")

 res0: org.apache.spark.sql.SchemaRDD =
 SchemaRDD[0] at RDD at SchemaRDD.scala:104
 == Query Plan ==
 Native command: executed by Hive

 scala> hiveContext.hql("drop table test_datatype3")

 14/08/29 19:34:14 ERROR DataNucleus.Datastore: An exception was thrown
 while adding/validating class(es) : Specified key was too long; max key
 length is 767 bytes
 com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key
 was too long; max key length is 767 bytes
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 at
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

 14/08/29 19:34:17 WARN DataNucleus.Query: Query for candidates of
 org.apache.hadoop.hive.metastore.model.MPartition and subclasses resulted
 in no possible candidates
 Error(s) were found while auto-creating/validating the datastore for
 classes. The errors are printed in the log, and are attached to this
 exception.
 org.datanucleus.exceptions.NucleusDataStoreException: Error(s) were found
 while auto-creating/validating the datastore for classes. The errors are
 printed in the log, and are attached to this exception.
 at
 org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.verifyErrors(RDBMSStoreManager.java:3609)


 Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException:
 Specified key was too long; max key length is 767 bytes
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)



 Should I use HIVE 0.12.0 instead of HIVE 0.13.1?

 Regards
 Arthur

 On 31 Aug, 2014, at 6:01 am, Denny Lee denny.g@gmail.com wrote:

 Oh, you may actually be running into an issue with your MySQL setup. Try running

 alter database metastore_db character set latin1

 so that Hive (and the Spark HiveContext) can execute properly against the
 metastore.


 On August 29, 2014 at 04:39:01, arthur.hk.c...@gmail.com (
 arthur.hk.c...@gmail.com) wrote:

 Hi,


 Tried the same thing in HIVE directly without issue:

 HIVE:

 hive> create table test_datatype2 (testbigint bigint);
 OK
 Time taken: 0.708 seconds

 hive> drop table test_datatype2;
 OK
 Time taken: 23.272 seconds




 Then tried again in SPARK:
 scala> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
 14/08/29 19:33:52 INFO Configuration.deprecation:
 mapred.reduce.tasks.speculative.execution is deprecated. Instead, use
 mapreduce.reduce.speculative
 hiveContext: org.apache.spark.sql.hive.HiveContext =
 org.apache.spark.sql.hive.HiveContext@395c7b94

 scala> hiveContext.hql("create table test_datatype3 (testbigint bigint)")
 res0: org.apache.spark.sql.SchemaRDD =
 SchemaRDD[0] at RDD at SchemaRDD.scala:104
 == Query Plan ==
 Native command: executed by Hive

 scala> hiveContext.hql("drop table test_datatype3")

 14/08/29 19:34:14 ERROR DataNucleus.Datastore: An exception was thrown
 while adding/validating class(es) : Specified key was too long; max key
 length is 767 bytes
 com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key
 was too long; max key length is 767 bytes
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 at
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

 14/08/29 19:34:17 WARN DataNucleus.Query: Query for candidates of
 org.apache.hadoop.hive.metastore.model.MPartition and subclasses resulted
 in no possible candidates
 Error(s) were found while auto-creating/validating the datastore for
 classes. The errors are printed in the log, and are attached to this
 exception.
 org.datanucleus.exceptions.NucleusDataStoreException: Error(s) were found
 while auto-creating/validating the datastore for classes. The errors are
 printed in the log, and are attached to this exception.
 at
 org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.verifyErrors(RDBMSStoreManager.java:3609)


 Caused by: 

Re: Spark Hive max key length is 767 bytes

2014-08-30 Thread arthur.hk.c...@gmail.com
Hi Michael,

Thank you so much!!

I have tried to change the following key lengths, from 256 to 255 and from 767 to
766, but it still didn't work:
alter table COLUMNS_V2 modify column COMMENT VARCHAR(255);
alter table INDEX_PARAMS modify column PARAM_KEY VARCHAR(255);
alter table SD_PARAMS modify column PARAM_KEY VARCHAR(255);
alter table SERDE_PARAMS modify column PARAM_KEY VARCHAR(255);
alter table TABLE_PARAMS modify column PARAM_KEY VARCHAR(255);
alter table TBLS modify column OWNER VARCHAR(766);
alter table PART_COL_STATS modify column PARTITION_NAME VARCHAR(766);
alter table PARTITION_KEYS modify column PKEY_TYPE VARCHAR(766);
alter table PARTITIONS modify column PART_NAME VARCHAR(766);
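
One thing I am still not sure about: do the existing metastore tables themselves still use utf8? MySQL counts up to 3 bytes per character toward the 767-byte InnoDB index limit, so a 766-character key column is still sized at 766 * 3 = 2298 bytes under utf8, while the same column under latin1 is exactly 766 bytes. A quick way to check the table character sets, and to convert a single table as a test (COLUMNS_V2 is only an example here, and I am assuming the metastore database is named hive as above), would be:

mysql> select table_name, table_collation
    ->   from information_schema.TABLES
    ->  where table_schema = 'hive';
mysql> alter table COLUMNS_V2 convert to character set latin1;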

I use Hadoop 2.4.1, HBase 0.98.5, and Hive 0.13, and am trying Spark 1.0.2 and
Shark 0.9.2 with JDK 1.6_45.

Some questions:
Which Hive version is shark-0.9.2 based on? Is HBase 0.98.x OK? Is Hive 0.13.1 OK?
And which Java version? (I use JDK 1.6 at the moment; it does not seem to work.)
Which Hive version is spark-1.0.2 based on? Is HBase 0.98.x OK?

Regards
Arthur 


On 30 Aug, 2014, at 1:40 am, Michael Armbrust mich...@databricks.com wrote:

 Spark SQL is based on Hive 0.12. They must have changed the maximum key size
 between 0.12 and 0.13.
 
 
 On Fri, Aug 29, 2014 at 4:38 AM, arthur.hk.c...@gmail.com 
 arthur.hk.c...@gmail.com wrote:
 Hi,
 
 
 Tried the same thing in HIVE directly without issue:
 
 HIVE:
 hive> create table test_datatype2 (testbigint bigint);
 OK
 Time taken: 0.708 seconds
 
 hive> drop table test_datatype2;
 OK
 Time taken: 23.272 seconds
 
 
 
 Then tried again in SPARK:
 scala> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
 14/08/29 19:33:52 INFO Configuration.deprecation: 
 mapred.reduce.tasks.speculative.execution is deprecated. Instead, use 
 mapreduce.reduce.speculative
 hiveContext: org.apache.spark.sql.hive.HiveContext = 
 org.apache.spark.sql.hive.HiveContext@395c7b94
 
 scala> hiveContext.hql("create table test_datatype3 (testbigint bigint)")
 res0: org.apache.spark.sql.SchemaRDD = 
 SchemaRDD[0] at RDD at SchemaRDD.scala:104
 == Query Plan ==
 Native command: executed by Hive
 
 scala> hiveContext.hql("drop table test_datatype3")
 
 14/08/29 19:34:14 ERROR DataNucleus.Datastore: An exception was thrown while 
 adding/validating class(es) : Specified key was too long; max key length is 
 767 bytes
 com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was 
 too long; max key length is 767 bytes
   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
   at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
 
 14/08/29 19:34:17 WARN DataNucleus.Query: Query for candidates of 
 org.apache.hadoop.hive.metastore.model.MPartition and subclasses resulted in 
 no possible candidates
 Error(s) were found while auto-creating/validating the datastore for classes. 
 The errors are printed in the log, and are attached to this exception.
 org.datanucleus.exceptions.NucleusDataStoreException: Error(s) were found 
 while auto-creating/validating the datastore for classes. The errors are 
 printed in the log, and are attached to this exception.
   at 
 org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.verifyErrors(RDBMSStoreManager.java:3609)
 
 
 Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: 
 Specified key was too long; max key length is 767 bytes
   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
   at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
 
 14/08/29 19:34:17 INFO DataNucleus.Datastore: The class 
 org.apache.hadoop.hive.metastore.model.MFieldSchema is tagged as 
 embedded-only so does not have its own datastore table.
 14/08/29 19:34:17 INFO DataNucleus.Datastore: The class 
 org.apache.hadoop.hive.metastore.model.MOrder is tagged as embedded-only 
 so does not have its own datastore table.
 14/08/29 19:34:17 INFO DataNucleus.Datastore: The class 
 org.apache.hadoop.hive.metastore.model.MFieldSchema is tagged as 
 embedded-only so does not have its own datastore table.
 14/08/29 19:34:17 INFO DataNucleus.Datastore: The class 
 org.apache.hadoop.hive.metastore.model.MOrder is tagged as embedded-only 
 so does not have its own datastore table.
 14/08/29 19:34:17 INFO DataNucleus.Datastore: The class 
 org.apache.hadoop.hive.metastore.model.MFieldSchema is tagged as 
 embedded-only so does not have its own datastore table.
 14/08/29 19:34:17 INFO DataNucleus.Datastore: The class 
 org.apache.hadoop.hive.metastore.model.MOrder is tagged as embedded-only 
 so does not have its own datastore table.
 14/08/29 19:34:25 ERROR DataNucleus.Datastore: An exception was thrown while 
 adding/validating class(es) : Specified key was too long; max key length is 
 767 bytes
 com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was 
 too long; max key length is 767 bytes
   at 

Re: Spark Hive max key length is 767 bytes

2014-08-30 Thread Denny Lee
Oh, you may actually be running into an issue with your MySQL setup. Try running

alter database metastore_db character set latin1

so that Hive (and the Spark HiveContext) can execute properly against the
metastore.
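
A minimal MySQL session sketching that change (metastore_db is just the database name assumed here; if your metastore database is called hive, substitute that name) would be:

mysql> alter database metastore_db character set latin1;
mysql> select default_character_set_name
    ->   from information_schema.SCHEMATA
    ->  where schema_name = 'metastore_db';

Keep in mind that altering the database only changes the default character set for tables created afterwards; tables that already exist keep the character set they were created with.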


On August 29, 2014 at 04:39:01, arthur.hk.c...@gmail.com 
(arthur.hk.c...@gmail.com) wrote:

Hi,


Tried the same thing in HIVE directly without issue:

HIVE:
hive> create table test_datatype2 (testbigint bigint);
OK
Time taken: 0.708 seconds

hive> drop table test_datatype2;
OK
Time taken: 23.272 seconds



Then tried again in SPARK:
scala> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
14/08/29 19:33:52 INFO Configuration.deprecation: 
mapred.reduce.tasks.speculative.execution is deprecated. Instead, use 
mapreduce.reduce.speculative
hiveContext: org.apache.spark.sql.hive.HiveContext = 
org.apache.spark.sql.hive.HiveContext@395c7b94

scala> hiveContext.hql("create table test_datatype3 (testbigint bigint)")
res0: org.apache.spark.sql.SchemaRDD = 
SchemaRDD[0] at RDD at SchemaRDD.scala:104
== Query Plan ==
Native command: executed by Hive

scala> hiveContext.hql("drop table test_datatype3")

14/08/29 19:34:14 ERROR DataNucleus.Datastore: An exception was thrown while 
adding/validating class(es) : Specified key was too long; max key length is 767 
bytes
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was 
too long; max key length is 767 bytes
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

14/08/29 19:34:17 WARN DataNucleus.Query: Query for candidates of 
org.apache.hadoop.hive.metastore.model.MPartition and subclasses resulted in no 
possible candidates
Error(s) were found while auto-creating/validating the datastore for classes. 
The errors are printed in the log, and are attached to this exception.
org.datanucleus.exceptions.NucleusDataStoreException: Error(s) were found while 
auto-creating/validating the datastore for classes. The errors are printed in 
the log, and are attached to this exception.
at 
org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.verifyErrors(RDBMSStoreManager.java:3609)


Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified 
key was too long; max key length is 767 bytes
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

14/08/29 19:34:17 INFO DataNucleus.Datastore: The class 
org.apache.hadoop.hive.metastore.model.MFieldSchema is tagged as 
embedded-only so does not have its own datastore table.
14/08/29 19:34:17 INFO DataNucleus.Datastore: The class 
org.apache.hadoop.hive.metastore.model.MOrder is tagged as embedded-only so 
does not have its own datastore table.
14/08/29 19:34:17 INFO DataNucleus.Datastore: The class 
org.apache.hadoop.hive.metastore.model.MFieldSchema is tagged as 
embedded-only so does not have its own datastore table.
14/08/29 19:34:17 INFO DataNucleus.Datastore: The class 
org.apache.hadoop.hive.metastore.model.MOrder is tagged as embedded-only so 
does not have its own datastore table.
14/08/29 19:34:17 INFO DataNucleus.Datastore: The class 
org.apache.hadoop.hive.metastore.model.MFieldSchema is tagged as 
embedded-only so does not have its own datastore table.
14/08/29 19:34:17 INFO DataNucleus.Datastore: The class 
org.apache.hadoop.hive.metastore.model.MOrder is tagged as embedded-only so 
does not have its own datastore table.
14/08/29 19:34:25 ERROR DataNucleus.Datastore: An exception was thrown while 
adding/validating class(es) : Specified key was too long; max key length is 767 
bytes
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was 
too long; max key length is 767 bytes
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)


Can anyone please help?

Regards
Arthur


On 29 Aug, 2014, at 12:47 pm, arthur.hk.c...@gmail.com 
arthur.hk.c...@gmail.com wrote:

(Please ignore if duplicated) 


Hi,

I use Spark 1.0.2 with Hive 0.13.1

I have already set the hive MySQL database to latin1:

mysql:
alter database hive character set latin1;

Spark:
scala> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
scala> hiveContext.hql("create table test_datatype1 (testbigint bigint)")
scala> hiveContext.hql("drop table test_datatype1")


14/08/29 12:31:55 INFO DataNucleus.Datastore: The class 
org.apache.hadoop.hive.metastore.model.MFieldSchema is tagged as 
embedded-only so does not have its own datastore table.
14/08/29 12:31:55 INFO DataNucleus.Datastore: The class 
org.apache.hadoop.hive.metastore.model.MOrder is tagged as embedded-only so 
does not have its own datastore table.
14/08/29 12:31:55 INFO DataNucleus.Datastore: The class 
org.apache.hadoop.hive.metastore.model.MFieldSchema is tagged as 
embedded-only so does not have its own datastore table.
14/08/29 12:31:55 INFO 

Re: Spark Hive max key length is 767 bytes

2014-08-30 Thread arthur.hk.c...@gmail.com
Hi,

Already done but still get the same error:

(I use Hive 0.13.1, Spark 1.0.2, and Hadoop 2.4.1)

Steps:
Step 1) mysql:
 
 alter database hive character set latin1;
Step 2) HIVE:
 hive> create table test_datatype2 (testbigint bigint);
 OK
 Time taken: 0.708 seconds
 
 hive> drop table test_datatype2;
 OK
 Time taken: 23.272 seconds
Step 3) scala> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
 14/08/29 19:33:52 INFO Configuration.deprecation: 
 mapred.reduce.tasks.speculative.execution is deprecated. Instead, use 
 mapreduce.reduce.speculative
 hiveContext: org.apache.spark.sql.hive.HiveContext = 
 org.apache.spark.sql.hive.HiveContext@395c7b94
 scala> hiveContext.hql("create table test_datatype3 (testbigint bigint)")
 res0: org.apache.spark.sql.SchemaRDD = 
 SchemaRDD[0] at RDD at SchemaRDD.scala:104
 == Query Plan ==
 Native command: executed by Hive
 scala> hiveContext.hql("drop table test_datatype3")
 
 14/08/29 19:34:14 ERROR DataNucleus.Datastore: An exception was thrown while 
 adding/validating class(es) : Specified key was too long; max key length is 
 767 bytes
 com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was 
 too long; max key length is 767 bytes
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
 
 14/08/29 19:34:17 WARN DataNucleus.Query: Query for candidates of 
 org.apache.hadoop.hive.metastore.model.MPartition and subclasses resulted in 
 no possible candidates
 Error(s) were found while auto-creating/validating the datastore for 
 classes. The errors are printed in the log, and are attached to this 
 exception.
 org.datanucleus.exceptions.NucleusDataStoreException: Error(s) were found 
 while auto-creating/validating the datastore for classes. The errors are 
 printed in the log, and are attached to this exception.
 at 
 org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.verifyErrors(RDBMSStoreManager.java:3609)
 
 
 Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: 
 Specified key was too long; max key length is 767 bytes
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)



Should I use HIVE 0.12.0 instead of HIVE 0.13.1?

Regards
Arthur

On 31 Aug, 2014, at 6:01 am, Denny Lee denny.g@gmail.com wrote:

 Oh, you may actually be running into an issue with your MySQL setup. Try running

 alter database metastore_db character set latin1

 so that Hive (and the Spark HiveContext) can execute properly against the
 metastore.
 
 
 On August 29, 2014 at 04:39:01, arthur.hk.c...@gmail.com 
 (arthur.hk.c...@gmail.com) wrote:
 
 Hi,
 
 
 Tried the same thing in HIVE directly without issue:
 
 HIVE:
 hive> create table test_datatype2 (testbigint bigint);
 OK
 Time taken: 0.708 seconds
 
 hive> drop table test_datatype2;
 OK
 Time taken: 23.272 seconds
 
 
 
 Then tried again in SPARK:
 scala> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
 14/08/29 19:33:52 INFO Configuration.deprecation: 
 mapred.reduce.tasks.speculative.execution is deprecated. Instead, use 
 mapreduce.reduce.speculative
 hiveContext: org.apache.spark.sql.hive.HiveContext = 
 org.apache.spark.sql.hive.HiveContext@395c7b94
 
 scala> hiveContext.hql("create table test_datatype3 (testbigint bigint)")
 res0: org.apache.spark.sql.SchemaRDD = 
 SchemaRDD[0] at RDD at SchemaRDD.scala:104
 == Query Plan ==
 Native command: executed by Hive
 
 scala> hiveContext.hql("drop table test_datatype3")
 
 14/08/29 19:34:14 ERROR DataNucleus.Datastore: An exception was thrown while 
 adding/validating class(es) : Specified key was too long; max key length is 
 767 bytes
 com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was 
 too long; max key length is 767 bytes
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
 
 14/08/29 19:34:17 WARN DataNucleus.Query: Query for candidates of 
 org.apache.hadoop.hive.metastore.model.MPartition and subclasses resulted in 
 no possible candidates
 Error(s) were found while auto-creating/validating the datastore for 
 classes. The errors are printed in the log, and are attached to this 
 exception.
 org.datanucleus.exceptions.NucleusDataStoreException: Error(s) were found 
 while auto-creating/validating the datastore for classes. The errors are 
 printed in the log, and are attached to this exception.
 at 
 org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.verifyErrors(RDBMSStoreManager.java:3609)
 
 
 Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: 
 Specified key was too long; max key length is 767 bytes
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
 
 14/08/29 19:34:17 INFO 

Re: Spark Hive max key length is 767 bytes

2014-08-29 Thread arthur.hk.c...@gmail.com
Hi,


Tried the same thing in HIVE directly without issue:

HIVE:
hive> create table test_datatype2 (testbigint bigint);
OK
Time taken: 0.708 seconds

hive> drop table test_datatype2;
OK
Time taken: 23.272 seconds



Then tried again in SPARK:
scala> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
14/08/29 19:33:52 INFO Configuration.deprecation: 
mapred.reduce.tasks.speculative.execution is deprecated. Instead, use 
mapreduce.reduce.speculative
hiveContext: org.apache.spark.sql.hive.HiveContext = 
org.apache.spark.sql.hive.HiveContext@395c7b94

scala> hiveContext.hql("create table test_datatype3 (testbigint bigint)")
res0: org.apache.spark.sql.SchemaRDD = 
SchemaRDD[0] at RDD at SchemaRDD.scala:104
== Query Plan ==
Native command: executed by Hive

scala> hiveContext.hql("drop table test_datatype3")

14/08/29 19:34:14 ERROR DataNucleus.Datastore: An exception was thrown while 
adding/validating class(es) : Specified key was too long; max key length is 767 
bytes
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was 
too long; max key length is 767 bytes
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

14/08/29 19:34:17 WARN DataNucleus.Query: Query for candidates of 
org.apache.hadoop.hive.metastore.model.MPartition and subclasses resulted in no 
possible candidates
Error(s) were found while auto-creating/validating the datastore for classes. 
The errors are printed in the log, and are attached to this exception.
org.datanucleus.exceptions.NucleusDataStoreException: Error(s) were found while 
auto-creating/validating the datastore for classes. The errors are printed in 
the log, and are attached to this exception.
at 
org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.verifyErrors(RDBMSStoreManager.java:3609)


Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified 
key was too long; max key length is 767 bytes
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

14/08/29 19:34:17 INFO DataNucleus.Datastore: The class 
org.apache.hadoop.hive.metastore.model.MFieldSchema is tagged as 
embedded-only so does not have its own datastore table.
14/08/29 19:34:17 INFO DataNucleus.Datastore: The class 
org.apache.hadoop.hive.metastore.model.MOrder is tagged as embedded-only so 
does not have its own datastore table.
14/08/29 19:34:17 INFO DataNucleus.Datastore: The class 
org.apache.hadoop.hive.metastore.model.MFieldSchema is tagged as 
embedded-only so does not have its own datastore table.
14/08/29 19:34:17 INFO DataNucleus.Datastore: The class 
org.apache.hadoop.hive.metastore.model.MOrder is tagged as embedded-only so 
does not have its own datastore table.
14/08/29 19:34:17 INFO DataNucleus.Datastore: The class 
org.apache.hadoop.hive.metastore.model.MFieldSchema is tagged as 
embedded-only so does not have its own datastore table.
14/08/29 19:34:17 INFO DataNucleus.Datastore: The class 
org.apache.hadoop.hive.metastore.model.MOrder is tagged as embedded-only so 
does not have its own datastore table.
14/08/29 19:34:25 ERROR DataNucleus.Datastore: An exception was thrown while 
adding/validating class(es) : Specified key was too long; max key length is 767 
bytes
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was 
too long; max key length is 767 bytes
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)


Can anyone please help?

Regards
Arthur


On 29 Aug, 2014, at 12:47 pm, arthur.hk.c...@gmail.com 
arthur.hk.c...@gmail.com wrote:

 (Please ignore if duplicated) 
 
 
 Hi,
 
 I use Spark 1.0.2 with Hive 0.13.1
 
 I have already set the hive MySQL database to latin1:
 
 mysql:
 alter database hive character set latin1;
 
 Spark:
 scala> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
 scala> hiveContext.hql("create table test_datatype1 (testbigint bigint)")
 scala> hiveContext.hql("drop table test_datatype1")
 
 
 14/08/29 12:31:55 INFO DataNucleus.Datastore: The class 
 org.apache.hadoop.hive.metastore.model.MFieldSchema is tagged as 
 embedded-only so does not have its own datastore table.
 14/08/29 12:31:55 INFO DataNucleus.Datastore: The class 
 org.apache.hadoop.hive.metastore.model.MOrder is tagged as embedded-only 
 so does not have its own datastore table.
 14/08/29 12:31:55 INFO DataNucleus.Datastore: The class 
 org.apache.hadoop.hive.metastore.model.MFieldSchema is tagged as 
 embedded-only so does not have its own datastore table.
 14/08/29 12:31:55 INFO DataNucleus.Datastore: The class 
 org.apache.hadoop.hive.metastore.model.MOrder is tagged as embedded-only 
 so does not have its own datastore table.
 14/08/29 12:31:59 ERROR DataNucleus.Datastore: An exception was thrown while 
 

Re: Spark Hive max key length is 767 bytes

2014-08-29 Thread Michael Armbrust
Spark SQL is based on Hive 0.12. They must have changed the maximum key size
between 0.12 and 0.13.
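
For a rough sense of where the limit bites (this sketch assumes an InnoDB-backed metastore whose MySQL database uses the utf8 character set, which MySQL sizes at up to 3 bytes per character when building index keys):

-- a 255-character key column:  255 * 3 =  765 bytes  -> under the 767-byte limit
-- a 256-character key column:  256 * 3 =  768 bytes  -> "Specified key was too long"
-- a 767-character key column
-- (e.g. PART_COL_STATS.PARTITION_NAME): 767 * 3 = 2301 bytes  -> fails
-- under latin1 the same 767-character column is exactly 767 bytes and fits:
alter database hive character set latin1;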


On Fri, Aug 29, 2014 at 4:38 AM, arthur.hk.c...@gmail.com 
arthur.hk.c...@gmail.com wrote:

 Hi,


 Tried the same thing in HIVE directly without issue:

 HIVE:
 hive> create table test_datatype2 (testbigint bigint);
 OK
 Time taken: 0.708 seconds

 hive> drop table test_datatype2;
 OK
 Time taken: 23.272 seconds



 Then tried again in SPARK:
 scala> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
 14/08/29 19:33:52 INFO Configuration.deprecation:
 mapred.reduce.tasks.speculative.execution is deprecated. Instead, use
 mapreduce.reduce.speculative
 hiveContext: org.apache.spark.sql.hive.HiveContext =
 org.apache.spark.sql.hive.HiveContext@395c7b94

 scala> hiveContext.hql("create table test_datatype3 (testbigint bigint)")
 res0: org.apache.spark.sql.SchemaRDD =
 SchemaRDD[0] at RDD at SchemaRDD.scala:104
 == Query Plan ==
 Native command: executed by Hive

 scala> hiveContext.hql("drop table test_datatype3")

 14/08/29 19:34:14 ERROR DataNucleus.Datastore: An exception was thrown
 while adding/validating class(es) : Specified key was too long; max key
 length is 767 bytes
 com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key
 was too long; max key length is 767 bytes
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 at
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

 14/08/29 19:34:17 WARN DataNucleus.Query: Query for candidates of
 org.apache.hadoop.hive.metastore.model.MPartition and subclasses resulted
 in no possible candidates
 Error(s) were found while auto-creating/validating the datastore for
 classes. The errors are printed in the log, and are attached to this
 exception.
 org.datanucleus.exceptions.NucleusDataStoreException: Error(s) were found
 while auto-creating/validating the datastore for classes. The errors are
 printed in the log, and are attached to this exception.
  at
 org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.verifyErrors(RDBMSStoreManager.java:3609)


 Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException:
 Specified key was too long; max key length is 767 bytes
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

 14/08/29 19:34:17 INFO DataNucleus.Datastore: The class
 org.apache.hadoop.hive.metastore.model.MFieldSchema is tagged as
 embedded-only so does not have its own datastore table.
 14/08/29 19:34:17 INFO DataNucleus.Datastore: The class
 org.apache.hadoop.hive.metastore.model.MOrder is tagged as
 embedded-only so does not have its own datastore table.
 14/08/29 19:34:17 INFO DataNucleus.Datastore: The class
 org.apache.hadoop.hive.metastore.model.MFieldSchema is tagged as
 embedded-only so does not have its own datastore table.
 14/08/29 19:34:17 INFO DataNucleus.Datastore: The class
 org.apache.hadoop.hive.metastore.model.MOrder is tagged as
 embedded-only so does not have its own datastore table.
 14/08/29 19:34:17 INFO DataNucleus.Datastore: The class
 org.apache.hadoop.hive.metastore.model.MFieldSchema is tagged as
 embedded-only so does not have its own datastore table.
 14/08/29 19:34:17 INFO DataNucleus.Datastore: The class
 org.apache.hadoop.hive.metastore.model.MOrder is tagged as
 embedded-only so does not have its own datastore table.
 14/08/29 19:34:25 ERROR DataNucleus.Datastore: An exception was thrown
 while adding/validating class(es) : Specified key was too long; max key
 length is 767 bytes
 com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key
 was too long; max key length is 767 bytes
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)


 Can anyone please help?

 Regards
 Arthur


 On 29 Aug, 2014, at 12:47 pm, arthur.hk.c...@gmail.com 
 arthur.hk.c...@gmail.com wrote:

 (Please ignore if duplicated)


 Hi,

 I use Spark 1.0.2 with Hive 0.13.1

 I have already set the hive MySQL database to latin1:

 mysql:
 alter database hive character set latin1;

 Spark:
 scala> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
 scala> hiveContext.hql("create table test_datatype1 (testbigint bigint)")
 scala> hiveContext.hql("drop table test_datatype1")


 14/08/29 12:31:55 INFO DataNucleus.Datastore: The class
 org.apache.hadoop.hive.metastore.model.MFieldSchema is tagged as
 embedded-only so does not have its own datastore table.
 14/08/29 12:31:55 INFO DataNucleus.Datastore: The class
 org.apache.hadoop.hive.metastore.model.MOrder is tagged as
 embedded-only so does not have its own datastore table.
 14/08/29 12:31:55 INFO DataNucleus.Datastore: The class
 org.apache.hadoop.hive.metastore.model.MFieldSchema is tagged as
 embedded-only so does not have its own datastore table.
 14/08/29 12:31:55 INFO DataNucleus.Datastore: The class
 

Spark Hive max key length is 767 bytes

2014-08-28 Thread arthur.hk.c...@gmail.com
(Please ignore if duplicated) 


Hi,

I use Spark 1.0.2 with Hive 0.13.1

I have already set the hive MySQL database to latin1:

mysql:
alter database hive character set latin1;

Spark:
scala> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
scala> hiveContext.hql("create table test_datatype1 (testbigint bigint)")
scala> hiveContext.hql("drop table test_datatype1")


14/08/29 12:31:55 INFO DataNucleus.Datastore: The class 
org.apache.hadoop.hive.metastore.model.MFieldSchema is tagged as 
embedded-only so does not have its own datastore table.
14/08/29 12:31:55 INFO DataNucleus.Datastore: The class 
org.apache.hadoop.hive.metastore.model.MOrder is tagged as embedded-only so 
does not have its own datastore table.
14/08/29 12:31:55 INFO DataNucleus.Datastore: The class 
org.apache.hadoop.hive.metastore.model.MFieldSchema is tagged as 
embedded-only so does not have its own datastore table.
14/08/29 12:31:55 INFO DataNucleus.Datastore: The class 
org.apache.hadoop.hive.metastore.model.MOrder is tagged as embedded-only so 
does not have its own datastore table.
14/08/29 12:31:59 ERROR DataNucleus.Datastore: An exception was thrown while 
adding/validating class(es) : Specified key was too long; max key length is 767 
bytes
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was 
too long; max key length is 767 bytes
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:408)
at com.mysql.jdbc.Util.getInstance(Util.java:383)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1062)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4226)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4158)

Can you please advise what would be wrong?

Regards
Arthur