[jira] [Commented] (HIVE-8102) Partitions of type 'date' behave incorrectly with daylight saving time.

2014-09-16 Thread Eli Acherkan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14135083#comment-14135083
 ] 

Eli Acherkan commented on HIVE-8102:


Thanks [~jdere]! The patch appears to work well for us. (We haven't tested it 
in other timezones.)

 Partitions of type 'date' behave incorrectly with daylight saving time.
 ---

 Key: HIVE-8102
 URL: https://issues.apache.org/jira/browse/HIVE-8102
 Project: Hive
  Issue Type: Bug
  Components: Database/Schema, Serializers/Deserializers
Affects Versions: 0.13.0
Reporter: Eli Acherkan
 Attachments: HIVE-8102.1.patch


 At 2 AM on March 28th, 2014, Israel went from standard time (GMT+2) to daylight 
 saving time (GMT+3).
 The server's timezone is Asia/Jerusalem. When creating a partition whose key 
 is 2014-03-28, Hive creates a partition for 2014-03-27 instead:
 hive (default)> create table test (a int) partitioned by (`b_prt` date);
 OK
 Time taken: 0.092 seconds
 hive (default)> alter table test add partition (b_prt='2014-03-28');
 OK
 Time taken: 0.187 seconds
 hive (default)> show partitions test;
 OK
 partition
 b_prt=2014-03-27
 Time taken: 0.134 seconds, Fetched: 1 row(s)
 It seems that the root cause is the behavior of 
 DateWritable.daysToMillis/dateToDays.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8102) Partitions of type 'date' behave incorrectly with daylight saving time.

2014-09-15 Thread Eli Acherkan (JIRA)
Eli Acherkan created HIVE-8102:
--

 Summary: Partitions of type 'date' behave incorrectly with 
daylight saving time.
 Key: HIVE-8102
 URL: https://issues.apache.org/jira/browse/HIVE-8102
 Project: Hive
  Issue Type: Bug
  Components: Database/Schema, Serializers/Deserializers
Affects Versions: 0.13.0
Reporter: Eli Acherkan


At 2 AM on March 28th, 2014, Israel went from standard time (GMT+2) to daylight 
saving time (GMT+3).
The server's timezone is Asia/Jerusalem. When creating a partition whose key is 
2014-03-28, Hive creates a partition for 2014-03-27 instead:

hive (default)> create table test (a int) partitioned by (`b_prt` date);
OK
Time taken: 0.092 seconds
hive (default)> alter table test add partition (b_prt='2014-03-28');
OK
Time taken: 0.187 seconds
hive (default)> show partitions test;
OK
partition
b_prt=2014-03-27
Time taken: 0.134 seconds, Fetched: 1 row(s)

It seems that the root cause is the behavior of 
DateWritable.daysToMillis/dateToDays.
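
The following is a minimal sketch of how a days-to-milliseconds conversion can lose a day at a DST boundary. It is not Hive's actual code: the class and method names are hypothetical, and it assumes the conversion looks up the zone offset at the UTC-midnight instant rather than at local midnight. Under that assumption, the offset read for 2014-03-28 is already GMT+3, so the result lands at 23:00 on the previous local day.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;

public class DstDayShift {
    static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    // Assumed naive conversion (illustration only): treat days * MILLIS_PER_DAY
    // as a UTC instant, read the zone offset at that instant, and subtract it.
    static long naiveDaysToMillis(long days, ZoneId zone) {
        long millisUtc = days * MILLIS_PER_DAY;
        long offsetMillis =
            zone.getRules().getOffset(Instant.ofEpochMilli(millisUtc)).getTotalSeconds() * 1000L;
        return millisUtc - offsetMillis;
    }

    // Calendar-aware conversion: ask the zone where that local day starts.
    static long calendarDaysToMillis(long days, ZoneId zone) {
        return LocalDate.ofEpochDay(days).atStartOfDay(zone).toInstant().toEpochMilli();
    }

    // Local calendar date that a given instant falls on in the zone.
    static LocalDate localDateOf(long millis, ZoneId zone) {
        return Instant.ofEpochMilli(millis).atZone(zone).toLocalDate();
    }

    public static void main(String[] args) {
        ZoneId zone = ZoneId.of("Asia/Jerusalem");
        long days = LocalDate.of(2014, 3, 28).toEpochDay();

        // Offset at 2014-03-28T00:00Z is already +03:00, so we land on 23:00 of the 27th.
        System.out.println(localDateOf(naiveDaysToMillis(days, zone), zone));    // prints 2014-03-27

        // Local midnight on the 28th is still in standard time (+02:00).
        System.out.println(localDateOf(calendarDaysToMillis(days, zone), zone)); // prints 2014-03-28
    }
}
```

On days without a transition between local midnight and UTC midnight, both conversions agree, which may be why the problem only shows up for this one partition key.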





[jira] [Commented] (HIVE-8102) Partitions of type 'date' behave incorrectly with daylight saving time.

2014-09-15 Thread Eli Acherkan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133985#comment-14133985
 ] 

Eli Acherkan commented on HIVE-8102:


The following test fails when running in the Asia/Jerusalem timezone:

    Date originalDate = new Date(114, 2, 28); // March 28th 2014 - DST begins on this day at 02:00.
    DateWritable dateWritable = new DateWritable(originalDate);
    // Assertion fails because dateWritable.get() returns 2014-03-27 23:00:00 IST.
    assertEquals(originalDate, dateWritable.get());

In order to be able to run this unit test in any timezone, we explicitly set 
the timezone and run it in a separate thread, so that the thread-local member 
DateWritable.LOCAL_TIMEZONE is initialized with the correct one:

    public void testDaylightSavingsTime() throws InterruptedException, ExecutionException {
        TimeZone previousDefault = TimeZone.getDefault();
        TimeZone.setDefault(TimeZone.getTimeZone("Asia/Jerusalem"));
        ExecutorService threadPool = Executors.newFixedThreadPool(1);
        try {
            Future<Boolean> future = threadPool.submit(new Callable<Boolean>() {

                @Override
                public Boolean call() throws Exception {
                    // March 28th 2014 - DST begins on this day at 02:00.
                    Date originalDate = new Date(114, 2, 28);
                    DateWritable dateWritable = new DateWritable(originalDate);
                    return originalDate.equals(dateWritable.get());
                }
            });
            assertTrue(future.get());
        } finally {
            // Restore global state even if the assertion fails.
            threadPool.shutdown();
            TimeZone.setDefault(previousDefault);
        }
    }
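
To illustrate why the test needs a fresh thread, here is a small sketch of how a thread-local that captures the default timezone on first access behaves. CACHED_ZONE is a hypothetical stand-in for a field like the one described above, not the real Hive member: a thread that has already read it keeps the old zone after TimeZone.setDefault, while a newly created thread picks up the new default.

```java
import java.util.TimeZone;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalZoneDemo {
    // Hypothetical per-thread cache: initialized from the JVM default on first access.
    static final ThreadLocal<TimeZone> CACHED_ZONE = ThreadLocal.withInitial(TimeZone::getDefault);

    // Reads the cached zone ID from a brand-new worker thread.
    static String zoneSeenByFreshThread() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            Callable<String> task = () -> CACHED_ZONE.get().getID();
            return pool.submit(task).get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        TimeZone previous = TimeZone.getDefault();
        try {
            TimeZone.setDefault(TimeZone.getTimeZone("UTC"));
            CACHED_ZONE.get(); // this thread now caches UTC

            TimeZone.setDefault(TimeZone.getTimeZone("Asia/Jerusalem"));
            System.out.println(CACHED_ZONE.get().getID());  // prints UTC (stale on this thread)
            System.out.println(zoneSeenByFreshThread());    // prints Asia/Jerusalem
        } finally {
            TimeZone.setDefault(previous); // restore global state
        }
    }
}
```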







[jira] [Commented] (HIVE-6113) Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

2014-04-08 Thread Eli Acherkan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963055#comment-13963055
 ] 

Eli Acherkan commented on HIVE-6113:


The exact same issue reproduces here. Hive 0.12 on MapR 3.1.0 with MySQL 
metastore. The exception appears when there are several processes working with 
Hive concurrently.

From our analysis the problem seems related to the one described here: 
http://mail-archives.apache.org/mod_mbox/hive-user/201107.mbox/%3c4f6b25afffcafe44b6259a412d5f9b1033183...@exchmbx104.netflix.com%3E

h5. Analysis:
At certain times, Hive's DataNucleus decides to create and then drop tables 
called DELETEME+timestamp in the metastore schema on MySQL (see 
[ProbeTable|http://sourceforge.net/p/datanucleus/code/HEAD/tree/platform/store.rdbms/tags/datanucleus-rdbms-3.2.2/src/java/org/datanucleus/store/rdbms/table/ProbeTable.java]).

During other flows, DataNucleus queries MySQL for the list of all the columns 
of all the tables (see 
[RDBMSSchemaHandler.refreshTableData|http://sourceforge.net/p/datanucleus/code/HEAD/tree/platform/store.rdbms/tags/datanucleus-rdbms-3.2.2/src/java/org/datanucleus/store/rdbms/schema/RDBMSSchemaHandler.java#l872]).
 MySQL's JDBC driver implements the DatabaseMetaData.getColumns method by 
querying the DB for a list of all the tables, and then iterating over that list 
and querying for each table's columns (see 
[com.mysql.jdbc.DatabaseMetaData|http://bazaar.launchpad.net/~mysql/connectorj/5.1/view/head:/src/com/mysql/jdbc/DatabaseMetaData.java#L2581]).
 If a table is deleted from the DB during this operation, 
DatabaseMetaData.getColumns will throw an exception.
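
The list-then-iterate pattern described above can be modeled with a toy in-memory catalog. This is only a sketch of the race, not the MySQL driver's code: MetadataRaceSketch, CATALOG, the table and column names, and the betweenSteps hook (which stands in for a concurrent session dropping a table) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MetadataRaceSketch {
    // Toy catalog: table name -> column names.
    static final Map<String, List<String>> CATALOG = new LinkedHashMap<>();

    // Step 1 snapshots the table names; step 2 looks up each table's columns.
    // betweenSteps simulates another session acting between the two steps.
    static Map<String, List<String>> listAllColumns(Runnable betweenSteps) {
        List<String> tables = new ArrayList<>(CATALOG.keySet());
        betweenSteps.run();
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String table : tables) {
            List<String> columns = CATALOG.get(table);
            if (columns == null) {
                // The snapshot is stale: the table vanished after it was listed.
                throw new IllegalStateException("Table '" + table + "' doesn't exist");
            }
            result.put(table, columns);
        }
        return result;
    }

    public static void main(String[] args) {
        CATALOG.put("DBS", Arrays.asList("COL_A", "COL_B"));
        CATALOG.put("DELETEME1234567890", Arrays.asList("UNUSED"));
        try {
            // A probe table is dropped while the column listing is in progress.
            listAllColumns(() -> CATALOG.remove("DELETEME1234567890"));
        } catch (IllegalStateException e) {
            System.out.println("Listing failed: " + e.getMessage());
        }
    }
}
```

In this model the failure has nothing to do with the tables the caller cares about; any short-lived table that disappears between the two steps is enough to break the whole listing.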

This exception is interpreted by Hive to mean that the default Hive database 
doesn't exist. Hive tries to create it, inserting a row into the metastore.DBS 
table in MySQL, which triggers the "Duplicate entry 'default' for key 
'UNIQUE_DATABASE'" exception.

I'm not completely clear about the conditions for a) DataNucleus creating and 
dropping a DELETEME table, and b) DataNucleus calling 
DatabaseMetaData.getColumns, so unfortunately I can't yet provide a clear test 
case. But in our lab environment under load we were able to reproduce the 
exception once every few minutes.

h5. Workaround:
As suggested by the link above, setting the *datanucleus.fixedDatastore* 
property to *true* (e.g. in hive-site.xml or elsewhere) seems to solve the 
problem. However, it means that the metastore schema is no longer automatically 
created on-demand, and requires using Hive's schematool command to manually 
create the metastore schema.
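
For reference, a hive-site.xml entry for this workaround would presumably look like the fragment below. It uses the standard Hadoop-style property layout; only the property name and value come from the description above, and where the setting belongs in a given deployment is an assumption.

```xml
<!-- Sketch: stop DataNucleus from creating or altering metastore tables at runtime. -->
<property>
  <name>datanucleus.fixedDatastore</name>
  <value>true</value>
</property>
```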

 Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
 --

 Key: HIVE-6113
 URL: https://issues.apache.org/jira/browse/HIVE-6113
 Project: Hive
  Issue Type: Bug
  Components: Database/Schema
Affects Versions: 0.12.0
 Environment: hadoop-0.20.2-cdh3u3,hive-0.12.0
Reporter: William Stone
Priority: Critical
  Labels: HiveMetaStoreClient, metastore, unable_instantiate

 When I execute the SQL "use fdm; desc formatted fdm.tableName;" from Python, 
 the error below is thrown.
 But when I try it again, it succeeds.
 2013-12-25 03:01:32,290 ERROR exec.DDLTask (DDLTask.java:execute(435)) - org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
   at org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1143)
   at org.apache.hadoop.hive.ql.metadata.Hive.databaseExists(Hive.java:1128)
   at org.apache.hadoop.hive.ql.exec.DDLTask.switchDatabase(DDLTask.java:3479)
   at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:237)
   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
   at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1414)
   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1192)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1020)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888)
   at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:260)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:217)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:507)
   at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:875)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:769)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:708)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at