[ https://issues.apache.org/jira/browse/PHOENIX-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17630890#comment-17630890 ]
Istvan Toth commented on PHOENIX-5066:
--------------------------------------

I now have a functionally complete WIP patch (it still needs optimizations and cleanups) at [https://github.com/apache/phoenix/pull/1504]

At the heart of the solution is the new ExpressionContext object. This object encapsulates the context necessary for evaluating an expression. It is tied to a PhoenixConnection object, and can be configured by properties. At the moment it stores the TimeZone and the format strings.

We have two implementations. One of them is GMTExpressionContext, which implements the old date handling behaviour, which treats dates (mostly, at least on the server side) as GMT timestamps. The new implementation is CompliantExpressionContext, which is initialized with the client TimeZone (this can be overridden), and uses it for parsing and printing Strings, and for implementing the date functions.

The current Phoenix expression code, especially the singleton type system, is very resistant to adding this context to it, so my implementation uses a ThreadLocal variable to store and access the ExpressionContext.

The big challenge is making sure that we propagate the ExpressionContext through all the components of a statement execution.
- We're creating new threads for processing scans, and we need to make sure that the ExpressionContext ThreadLocal is propagated.
- We're also manipulating the classloader, so we need to make sure that we copy the ExpressionContext TL to the new Classloader.
- We need to push the context to the coprocessors. We do this by encoding the context into Scan properties, and processing those in the coprocessors.
- The Expression reconstruction happens BEFORE we rebuild the ExpressionContext TL in the coprocessor hook, so we need to lazily evaluate the context when executing the expression.
- There are a few hacky places where we create a ConnectionlessQueryServices object on the server side (to handle default expressions).
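To illustrate the thread propagation point above, here is a minimal sketch of carrying a ThreadLocal value into a worker thread by capturing it when the task is submitted. It is not taken from the patch: the class ContextPropagationSketch, the Ctx stand-in for ExpressionContext, and the withContext helper are all hypothetical names, and the real code may well propagate the context differently.

```java
import java.util.TimeZone;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ContextPropagationSketch {
    // Hypothetical stand-in for ExpressionContext; it only carries a TimeZone.
    static final class Ctx {
        final TimeZone timeZone;
        Ctx(TimeZone timeZone) { this.timeZone = timeZone; }
    }

    // Per-thread slot holding the current context (assumed design, not Phoenix code).
    static final ThreadLocal<Ctx> CURRENT = new ThreadLocal<>();

    // Captures the caller's context now and installs it around the task later,
    // on whichever thread ends up running the task.
    static <T> Callable<T> withContext(Callable<T> task) {
        final Ctx captured = CURRENT.get();
        return () -> {
            Ctx previous = CURRENT.get();
            CURRENT.set(captured);
            try {
                return task.call();
            } finally {
                // Restore the worker's previous state so pooled threads do not leak context.
                if (previous == null) {
                    CURRENT.remove();
                } else {
                    CURRENT.set(previous);
                }
            }
        };
    }

    // Sets a context on the calling thread and reports the zone ID a worker thread sees.
    static String zoneSeenByWorker(String zoneId) {
        CURRENT.set(new Ctx(TimeZone.getTimeZone(zoneId)));
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<String> task = () -> CURRENT.get().timeZone.getID();
            Future<String> result = pool.submit(withContext(task));
            return result.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
            CURRENT.remove();
        }
    }

    public static void main(String[] args) {
        System.out.println(zoneSeenByWorker("GMT+08:00"));
    }
}
```

Without the withContext wrapper, the worker thread would see an empty ThreadLocal and the lookup would fail, which is the failure mode the first bullet is guarding against.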
I have added new fields to the relevant RPC definitions in MetaDataService.proto to push the ExpressionContext into the operations.

> The TimeZone is incorrectly used during writing or reading data
> ---------------------------------------------------------------
>
> Key: PHOENIX-5066
> URL: https://issues.apache.org/jira/browse/PHOENIX-5066
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 5.0.0, 4.14.1
> Reporter: Jaanai Zhang
> Assignee: Istvan Toth
> Priority: Critical
> Fix For: 5.3.0
>
> Attachments: DateTest.java, PHOENIX-5066.4x.v1.patch,
> PHOENIX-5066.4x.v2.patch, PHOENIX-5066.4x.v3.patch,
> PHOENIX-5066.master.v1.patch, PHOENIX-5066.master.v2.patch,
> PHOENIX-5066.master.v3.patch, PHOENIX-5066.master.v4.patch,
> PHOENIX-5066.master.v5.patch, PHOENIX-5066.master.v6.patch
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> There are two ways to write data when using the JDBC API.
> #1. Use the _executeUpdate_ method to execute a string containing an UPSERT statement.
> #2. Use the _prepareStatement_ method to set some objects and execute.
> The _string_ data needs to be converted to a new object according to the schema
> information of the table. We use date formatters to convert string data to objects
> for the Date/Time/Timestamp types when writing data, and the same formatters are
> used when reading data as well.
>
> *Test using the default timezone*
> Writing 3 records in the different ways.
> {code:java}
> UPSERT INTO date_test VALUES (1,'2018-12-10 15:40:47','2018-12-10 15:40:47','2018-12-10 15:40:47')
> UPSERT INTO date_test VALUES (2,to_date('2018-12-10 15:40:47'),to_time('2018-12-10 15:40:47'),to_timestamp('2018-12-10 15:40:47'))
> stmt.setInt(1, 3);stmt.setDate(2, date);stmt.setTime(3, time);stmt.setTimestamp(4, ts);
> {code}
> Reading the table by the getObject(getDate/getTime/getTimestamp) methods.
> {code:java}
> 1 | 2018-12-10 | 23:45:07 | 2018-12-10 23:45:07.0
> 2 | 2018-12-10 | 23:45:07 | 2018-12-10 23:45:07.0
> 3 | 2018-12-10 | 15:45:07 | 2018-12-10 15:45:07.66
> {code}
> Reading the table by the getString methods
> {code:java}
> 1 | 2018-12-10 15:45:07.000 | 2018-12-10 15:45:07.000 | 2018-12-10 15:45:07.000
> 2 | 2018-12-10 15:45:07.000 | 2018-12-10 15:45:07.000 | 2018-12-10 15:45:07.000
> 3 | 2018-12-10 07:45:07.660 | 2018-12-10 07:45:07.660 | 2018-12-10 07:45:07.660
> {code}
> *Test using GMT+8*
> Writing 3 records in the different ways.
> {code:java}
> UPSERT INTO date_test VALUES (1,'2018-12-10 15:40:47','2018-12-10 15:40:47','2018-12-10 15:40:47')
> UPSERT INTO date_test VALUES (2,to_date('2018-12-10 15:40:47'),to_time('2018-12-10 15:40:47'),to_timestamp('2018-12-10 15:40:47'))
> stmt.setInt(1, 3);stmt.setDate(2, date);stmt.setTime(3, time);stmt.setTimestamp(4, ts);
> {code}
> Reading the table by the getObject(getDate/getTime/getTimestamp) methods.
> {code:java}
> 1 | 2018-12-10 | 23:40:47 | 2018-12-10 23:40:47.0
> 2 | 2018-12-10 | 15:40:47 | 2018-12-10 15:40:47.0
> 3 | 2018-12-10 | 15:40:47 | 2018-12-10 15:40:47.106
> {code}
> Reading the table by the getString methods
> {code:java}
> 1 | 2018-12-10 23:40:47.000 | 2018-12-10 23:40:47.000 | 2018-12-10 23:40:47.000
> 2 | 2018-12-10 15:40:47.000 | 2018-12-10 15:40:47.000 | 2018-12-10 15:40:47.000
> 3 | 2018-12-10 15:40:47.106 | 2018-12-10 15:40:47.106 | 2018-12-10 15:40:47.106
> {code}
>
> We have a historical problem: in #1, the string is parsed into
> Date/Time/Timestamp objects using a time zone, which means the actual data
> is changed when it is stored in the HBase table.
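
The underlying effect described in the issue can be reproduced with plain java.time, independent of Phoenix: the same date string maps to different stored instants depending on which time zone the parser assumes. The sketch below is only an illustration of that effect. The class name ZoneParseSketch and the helper toEpochMillis are hypothetical, and the format pattern is an assumption chosen to match the literals used in the test above.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

public class ZoneParseSketch {
    // Assumed pattern, matching literals such as '2018-12-10 15:40:47'.
    static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Interprets the zone-less text as wall-clock time in the given zone
    // and returns the resulting instant as epoch milliseconds.
    static long toEpochMillis(String text, String zoneId) {
        LocalDateTime local = LocalDateTime.parse(text, FMT);
        return local.atZone(ZoneId.of(zoneId)).toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        String text = "2018-12-10 15:40:47";
        long asGmt = toEpochMillis(text, "GMT");
        long asGmt8 = toEpochMillis(text, "GMT+8");
        // Difference in hours between the two interpretations of the same string.
        System.out.println((asGmt - asGmt8) / 3_600_000L); // prints 8
    }
}
```

If the writer and the reader do not agree on the zone used for this conversion, the value read back is shifted by the zone offset, which matches the 8 hour differences visible in the result rows above.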