[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282638#comment-16282638 ] ASF GitHub Bot commented on ARROW-1816: --- icexelloss closed pull request #1330: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/java/vector/pom.xml b/java/vector/pom.xml index 46e06aa1e..b436f5f9c 100644 --- a/java/vector/pom.xml +++ b/java/vector/pom.xml @@ -135,6 +135,13 @@ org.apache.drill.tools drill-fmpp-maven-plugin 1.5.0 + + +org.freemarker +freemarker +2.3.23 + + generate-fmpp diff --git a/java/vector/src/main/codegen/data/ValueVectorTypes.tdd b/java/vector/src/main/codegen/data/ValueVectorTypes.tdd index 970d887c7..565174a4d 100644 --- a/java/vector/src/main/codegen/data/ValueVectorTypes.tdd +++ b/java/vector/src/main/codegen/data/ValueVectorTypes.tdd @@ -73,26 +73,10 @@ { class: "UInt8" }, { class: "Float8", javaType: "double", boxedType: "Double", fields: [{name: "value", type: "double"}] }, { class: "DateMilli",javaType: "long", friendlyType: "LocalDateTime" }, -{ class: "TimeStampSec", javaType: "long", boxedType: "Long", friendlyType: "LocalDateTime" }, -{ class: "TimeStampMilli", javaType: "long", boxedType: "Long", friendlyType: "LocalDateTime" }, -{ class: "TimeStampMicro", javaType: "long", boxedType: "Long", friendlyType: "LocalDateTime" }, -{ class: "TimeStampNano",javaType: "long", boxedType: "Long", friendlyType: "LocalDateTime" }, -{ class: "TimeStampSecTZ", javaType: "long", boxedType: "Long", - typeParams: [ {name: "timezone", type: "String"} ], - arrowType: "org.apache.arrow.vector.types.pojo.ArrowType.Timestamp", - arrowTypeConstructorParams: ["org.apache.arrow.vector.types.TimeUnit.SECOND", "timezone"] }, -{ class: "TimeStampMilliTZ", javaType: "long", boxedType: "Long", - typeParams: [ {name: "timezone", type: "String"} ], - arrowType: "org.apache.arrow.vector.types.pojo.ArrowType.Timestamp", - arrowTypeConstructorParams: ["org.apache.arrow.vector.types.TimeUnit.MILLISECOND", "timezone"] }, -{ class: "TimeStampMicroTZ", javaType: "long", boxedType: "Long", - typeParams: [ {name: "timezone", type: "String"} ], - arrowType: "org.apache.arrow.vector.types.pojo.ArrowType.Timestamp", - arrowTypeConstructorParams: ["org.apache.arrow.vector.types.TimeUnit.MICROSECOND", "timezone"] }, -{ class: "TimeStampNanoTZ", javaType: "long", boxedType: "Long", - typeParams: [ {name: "timezone", type: "String"} ], - arrowType: "org.apache.arrow.vector.types.pojo.ArrowType.Timestamp", - arrowTypeConstructorParams: ["org.apache.arrow.vector.types.TimeUnit.NANOSECOND", "timezone"] }, +{ class: "Timestamp",javaType: "long", boxedType: "Long", friendlyType: "LocalDateTime" + typeParams: [ {name: "unit", type: "TimeUnit"}, { name: "timezone", type: "String"} ], + arrowType: "org.apache.arrow.vector.types.pojo.ArrowType.Timestamp", +}, { class: "TimeMicro" }, { class: "TimeNano" } ] @@ -116,7 +100,7 @@ { class: "Decimal", maxPrecisionDigits: 38, nDecimalDigits: 4, friendlyType: "BigDecimal", - typeParams: [ {name: "scale", type: "int"}, { name: "precision", type: "int"}], + typeParams: [ { name: "precision", type: "int"}, {name: "scale", type: "int"} ], arrowType: "org.apache.arrow.vector.types.pojo.ArrowType.Decimal", fields: [{name: "start", type: "int"}, {name: "buffer", type: "ArrowBuf"}] } diff --git a/java/vector/src/main/codegen/includes/vv_imports.ftl b/java/vector/src/main/codegen/includes/vv_imports.ftl index a55304d73..28a8953e2 100644 --- a/java/vector/src/main/codegen/includes/vv_imports.ftl +++ b/java/vector/src/main/codegen/includes/vv_imports.ftl @@ -55,6 +55,7 @@ import java.math.BigDecimal; import java.math.BigInteger; import
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271562#comment-16271562 ] ASF GitHub Bot commented on ARROW-1816: --- BryanCutler commented on a change in pull request #1330: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#discussion_r153920084 ## File path: java/vector/src/main/codegen/templates/AbstractPromotableFieldWriter.java ## @@ -42,6 +43,8 @@ * @param type the type of the values we want to write * @return the corresponding field writer */ + abstract protected FieldWriter getWriter(ArrowType type); + abstract protected FieldWriter getWriter(MinorType type); Review comment: Ok, sounds good This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin >Assignee: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271065#comment-16271065 ] ASF GitHub Bot commented on ARROW-1816: --- icexelloss commented on a change in pull request #1330: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#discussion_r153844744 ## File path: java/vector/src/main/codegen/templates/AbstractPromotableFieldWriter.java ## @@ -42,6 +43,8 @@ * @param type the type of the values we want to write * @return the corresponding field writer */ + abstract protected FieldWriter getWriter(ArrowType type); + abstract protected FieldWriter getWriter(MinorType type); Review comment: I agree ideally we should have just one. However currently, this is needed to support this: ``` TimestampWriter timestampWriter = rootWriter.timestamp("a", TimeUnit.SECOND, "America/New_York"); timestampWriter.writeTimestamp(1000) ``` The second line here, the type of "timestampWriter" is a promotable writer and it needs to lookup the writer by `MinorType.TIMESTAMP` We need to do more refactoring to support this API without `abstract protected FieldWriter getWriter(MinorType type)`, but since this is going to be internal change and this PR is already quite complicated, I prefer this to be a follow up. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin >Assignee: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271063#comment-16271063 ] ASF GitHub Bot commented on ARROW-1816: --- icexelloss commented on a change in pull request #1330: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#discussion_r153844744 ## File path: java/vector/src/main/codegen/templates/AbstractPromotableFieldWriter.java ## @@ -42,6 +43,8 @@ * @param type the type of the values we want to write * @return the corresponding field writer */ + abstract protected FieldWriter getWriter(ArrowType type); + abstract protected FieldWriter getWriter(MinorType type); Review comment: I agree ideally we should have just one. However currently, this is needed to support this: ``` TimestampWriter timestampWriter = rootWriter.timestamp("a", TimeUnit.SECOND, "America/New_York"); timestampWriter.writeTimestamp(1000) ``` The second line here, the type "timestampWriter" is a promotable writer and it needs to lookup the writer by `MinorType.TIMESTAMP` We need to do more refactoring to support this API without `abstract protected FieldWriter getWriter(MinorType type)`, but since this is going to be internal change and this PR is already quite complicated, I prefer this to be a follow up. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin >Assignee: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271062#comment-16271062 ] ASF GitHub Bot commented on ARROW-1816: --- icexelloss commented on a change in pull request #1330: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#discussion_r153844744 ## File path: java/vector/src/main/codegen/templates/AbstractPromotableFieldWriter.java ## @@ -42,6 +43,8 @@ * @param type the type of the values we want to write * @return the corresponding field writer */ + abstract protected FieldWriter getWriter(ArrowType type); + abstract protected FieldWriter getWriter(MinorType type); Review comment: I agree ideally we should have just one. However currently, this is needed to support this: ``` TimestampWriter timestampWriter = rootWriter.timestamp("a", TimeUnit.SECOND, "America/New_York"); timestampWriter.writeTimestamp(1000) ``` The second line here needs to lookup the writer by `MinorType.TIMESTAMP` We need to do more refactoring to support this API without `abstract protected FieldWriter getWriter(MinorType type)`, but since this is going to be internal change and this PR is already quite complicated, I prefer this to be a follow up. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin >Assignee: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16269662#comment-16269662 ] ASF GitHub Bot commented on ARROW-1816: --- BryanCutler commented on a change in pull request #1330: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#discussion_r153652065 ## File path: java/vector/src/main/codegen/templates/AbstractPromotableFieldWriter.java ## @@ -42,6 +43,8 @@ * @param type the type of the values we want to write * @return the corresponding field writer */ + abstract protected FieldWriter getWriter(ArrowType type); + abstract protected FieldWriter getWriter(MinorType type); Review comment: Is is possible to just have `getWriter(ArrowType)`? Having 2 methods so similar is a little confusing.. This is because you need a writer for timestamp type params too? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin >Assignee: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16269661#comment-16269661 ] ASF GitHub Bot commented on ARROW-1816: --- BryanCutler commented on a change in pull request #1330: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#discussion_r153652065 ## File path: java/vector/src/main/codegen/templates/AbstractPromotableFieldWriter.java ## @@ -42,6 +43,8 @@ * @param type the type of the values we want to write * @return the corresponding field writer */ + abstract protected FieldWriter getWriter(ArrowType type); + abstract protected FieldWriter getWriter(MinorType type); Review comment: Is is possible to just have `getWriter(ArrowType)`? Having 2 methods so similar is a little confusing.. This is because you need a writer for a specific time unit now? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin >Assignee: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16269656#comment-16269656 ] ASF GitHub Bot commented on ARROW-1816: --- icexelloss commented on a change in pull request #1330: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#discussion_r153651349 ## File path: java/vector/pom.xml ## @@ -135,6 +135,13 @@ org.apache.drill.tools drill-fmpp-maven-plugin 1.5.0 + + +org.freemarker +freemarker +2.3.23 + + Review comment: Yes I think that's the version comes with the plugin. I used <#sep> notation in the template to created comma separated arg list. The notation is introduced in 2.3.23 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin >Assignee: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16269650#comment-16269650 ] ASF GitHub Bot commented on ARROW-1816: --- BryanCutler commented on a change in pull request #1330: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#discussion_r153650965 ## File path: java/vector/pom.xml ## @@ -135,6 +135,13 @@ org.apache.drill.tools drill-fmpp-maven-plugin 1.5.0 + + +org.freemarker +freemarker +2.3.23 + + Review comment: So without specifying the version it defaults to 2.3.21? Was that version causing an issue? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin >Assignee: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16269462#comment-16269462 ] ASF GitHub Bot commented on ARROW-1816: --- icexelloss commented on issue #1330: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#issuecomment-347663530 @wesm I rebased after ARROW-1770. This should be ready. @jacques-n can you take a look? I'd really like this to get in 0.8 if this looks good. Thanks This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin >Assignee: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16269124#comment-16269124 ] ASF GitHub Bot commented on ARROW-1816: --- wesm commented on issue #1330: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#issuecomment-347603490 Needs rebase after ARROW-1710. If we are going to change this class structure, 0.8.0 would be the right time to do it. I'm concerned about delaying the release much further though -- if we can't get the Java work wrapped up this week, we can probably take another week, but we're getting into the red zone as far as the Spark 2.3.0 timeline is concerned This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin >Assignee: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16269098#comment-16269098 ] ASF GitHub Bot commented on ARROW-1816: --- icexelloss commented on a change in pull request #1330: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#discussion_r153566461 ## File path: java/vector/src/main/java/org/apache/arrow/vector/NullableTimestampVector.java ## @@ -179,6 +258,37 @@ public static long get(final ArrowBuf buffer, final int index) { return buffer.getLong(index * TYPE_WIDTH); } + public void get(int index, NullableTimestampHolder holder) { +if (isSet(index) == 0) { + holder.isSet = 0; + return; +} +holder.isSet = 1; +holder.value = valueBuffer.getLong(index * TYPE_WIDTH); + } + + @Override + public LocalDateTime getObject(int index) { +if (isSet(index) == 0) { + return null; +} else { + long millis = unit.getTimeUnit().toMillis(get(index)); Review comment: Good point. I encapsulated java TimeUnit detail. Also I ran a microbench toMilliis for 1B long values. Performance is the same. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin >Assignee: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264683#comment-16264683 ] ASF GitHub Bot commented on ARROW-1816: --- wesm commented on a change in pull request #1330: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#discussion_r152859038 ## File path: java/vector/src/main/java/org/apache/arrow/vector/NullableTimestampVector.java ## @@ -179,6 +258,37 @@ public static long get(final ArrowBuf buffer, final int index) { return buffer.getLong(index * TYPE_WIDTH); } + public void get(int index, NullableTimestampHolder holder) { +if (isSet(index) == 0) { + holder.isSet = 0; + return; +} +holder.isSet = 1; +holder.value = valueBuffer.getLong(index * TYPE_WIDTH); + } + + @Override + public LocalDateTime getObject(int index) { +if (isSet(index) == 0) { + return null; +} else { + long millis = unit.getTimeUnit().toMillis(get(index)); Review comment: Would it be more efficient to have a `toMillis` method on `TimeUnit` (and encapsulate this detail per law of demeter)? I guess we need to run some microbenchmarks to be able to judge performance This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin >Assignee: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263038#comment-16263038 ] ASF GitHub Bot commented on ARROW-1816: --- icexelloss commented on a change in pull request #1330: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#discussion_r152642480 ## File path: java/vector/src/main/java/org/apache/arrow/vector/types/TimeUnit.java ## @@ -19,10 +19,10 @@ package org.apache.arrow.vector.types; public enum TimeUnit { - SECOND(org.apache.arrow.flatbuf.TimeUnit.SECOND), - MILLISECOND(org.apache.arrow.flatbuf.TimeUnit.MILLISECOND), - MICROSECOND(org.apache.arrow.flatbuf.TimeUnit.MICROSECOND), - NANOSECOND(org.apache.arrow.flatbuf.TimeUnit.NANOSECOND); + SECOND(org.apache.arrow.flatbuf.TimeUnit.SECOND, java.util.concurrent.TimeUnit.SECONDS), + MILLISECOND(org.apache.arrow.flatbuf.TimeUnit.MILLISECOND, java.util.concurrent.TimeUnit.MILLISECONDS), + MICROSECOND(org.apache.arrow.flatbuf.TimeUnit.MICROSECOND, java.util.concurrent.TimeUnit.MICROSECONDS), + NANOSECOND(org.apache.arrow.flatbuf.TimeUnit.NANOSECOND, java.util.concurrent.TimeUnit.NANOSECONDS); Review comment: Enum equality should just be identity check: https://stackoverflow.com/questions/533922/is-it-ok-to-use-on-enums-in-java This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262940#comment-16262940 ] ASF GitHub Bot commented on ARROW-1816: --- BryanCutler commented on a change in pull request #1330: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#discussion_r152630009 ## File path: java/vector/src/main/java/org/apache/arrow/vector/types/TimeUnit.java ## @@ -19,10 +19,10 @@ package org.apache.arrow.vector.types; public enum TimeUnit { - SECOND(org.apache.arrow.flatbuf.TimeUnit.SECOND), - MILLISECOND(org.apache.arrow.flatbuf.TimeUnit.MILLISECOND), - MICROSECOND(org.apache.arrow.flatbuf.TimeUnit.MICROSECOND), - NANOSECOND(org.apache.arrow.flatbuf.TimeUnit.NANOSECOND); + SECOND(org.apache.arrow.flatbuf.TimeUnit.SECOND, java.util.concurrent.TimeUnit.SECONDS), + MILLISECOND(org.apache.arrow.flatbuf.TimeUnit.MILLISECOND, java.util.concurrent.TimeUnit.MILLISECONDS), + MICROSECOND(org.apache.arrow.flatbuf.TimeUnit.MICROSECOND, java.util.concurrent.TimeUnit.MICROSECONDS), + NANOSECOND(org.apache.arrow.flatbuf.TimeUnit.NANOSECOND, java.util.concurrent.TimeUnit.NANOSECONDS); Review comment: Well before the enum was defined by the flatbufId, but here it is the flatbufId and the Java TimeUnit which is redundant information. So a check for equality would have to look at both of these fields. Maybe not a big deal though.. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262907#comment-16262907 ] ASF GitHub Bot commented on ARROW-1816: --- BryanCutler commented on issue #1330: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#issuecomment-346408827 > @BryanCutler I am a bit reluctant to check unit and timezone in value holders because of performance reasons. Yeah it doesn't make sense to do all these checks each access, so I just wanted to pose the question to make sure it wasn't a blocker for doing this refactor. I don't use the holder APIs so it's fine with me but maybe @siddharthteotia has some thoughts on this? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262805#comment-16262805 ] ASF GitHub Bot commented on ARROW-1816: --- icexelloss commented on issue #1330: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#issuecomment-346236125 @BryanCutler I am a bit reluctant to check `unit` and `timezone` in value holders because of performance reasons. This is currently not checked with other type with type params either, such as decimal. We should maybe visit this problem as a whole as follow up? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261939#comment-16261939 ] ASF GitHub Bot commented on ARROW-1816: --- icexelloss commented on issue #1330: wip: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#issuecomment-346236125 @BryanCutler I am a bit reluctant to check `unit` and `timezone` if value holders because of performance reasons. This is currently not checked with other type with type params either, such as decimal. We should maybe visit this problem as a whole as follow up? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261941#comment-16261941 ] ASF GitHub Bot commented on ARROW-1816: --- icexelloss commented on issue #1330: wip: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#issuecomment-346236125 @BryanCutler I am a bit reluctant to check `unit` and `timezone` i value holders because of performance reasons. This is currently not checked with other type with type params either, such as decimal. We should maybe visit this problem as a whole as follow up? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261726#comment-16261726 ] ASF GitHub Bot commented on ARROW-1816: --- icexelloss commented on a change in pull request #1330: wip: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#discussion_r152437974 ## File path: java/vector/src/main/codegen/templates/MapWriters.java ## @@ -242,7 +242,7 @@ public void end() { <#assign constructorParams = minor.arrowTypeConstructorParams /> <#else> <#assign constructorParams = [] /> - <#list minor.typeParams?reverse as typeParam> + <#list minor.typeParams as typeParam> Review comment: Making param order more sane.. It's driving me nuts we are flipping the ordering twice. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261723#comment-16261723 ] ASF GitHub Bot commented on ARROW-1816: --- icexelloss commented on a change in pull request #1330: wip: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#discussion_r152437699 ## File path: java/vector/src/main/java/org/apache/arrow/vector/types/TimeUnit.java ## @@ -19,10 +19,10 @@ package org.apache.arrow.vector.types; public enum TimeUnit { - SECOND(org.apache.arrow.flatbuf.TimeUnit.SECOND), - MILLISECOND(org.apache.arrow.flatbuf.TimeUnit.MILLISECOND), - MICROSECOND(org.apache.arrow.flatbuf.TimeUnit.MICROSECOND), - NANOSECOND(org.apache.arrow.flatbuf.TimeUnit.NANOSECOND); + SECOND(org.apache.arrow.flatbuf.TimeUnit.SECOND, java.util.concurrent.TimeUnit.SECONDS), + MILLISECOND(org.apache.arrow.flatbuf.TimeUnit.MILLISECOND, java.util.concurrent.TimeUnit.MILLISECONDS), + MICROSECOND(org.apache.arrow.flatbuf.TimeUnit.MICROSECOND, java.util.concurrent.TimeUnit.MICROSECONDS), + NANOSECOND(org.apache.arrow.flatbuf.TimeUnit.NANOSECOND, java.util.concurrent.TimeUnit.NANOSECONDS); Review comment: Not sure why that's better...Why do you think so? Maybe I missed sth. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261714#comment-16261714 ] ASF GitHub Bot commented on ARROW-1816: --- icexelloss commented on a change in pull request #1330: wip: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#discussion_r152436633 ## File path: java/vector/src/main/codegen/data/ValueVectorTypes.tdd ## @@ -116,7 +100,7 @@ { class: "Decimal", maxPrecisionDigits: 38, nDecimalDigits: 4, friendlyType: "BigDecimal", - typeParams: [ {name: "scale", type: "int"}, { name: "precision", type: "int"}], + typeParams: [ { name: "precision", type: "int"}, {name: "scale", type: "int"} ], Review comment: The reverse ordering the these typeParams are driving me nuts... I changed it such that they follow the same order everywhere. Also see: https://github.com/apache/arrow/pull/1330/files#r152436539 The end results are the same because flipped twice.. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261712#comment-16261712 ] ASF GitHub Bot commented on ARROW-1816: --- icexelloss commented on a change in pull request #1330: wip: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#discussion_r152436539 ## File path: java/vector/src/main/codegen/templates/FixedValueVectors.java ## @@ -56,7 +56,7 @@ private int allocationMonitor = 0; <#if minor.typeParams??> -<#assign typeParams = minor.typeParams?reverse /> +<#assign typeParams = minor.typeParams /> Review comment: Making type params order more sane... This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261704#comment-16261704 ] ASF GitHub Bot commented on ARROW-1816: --- icexelloss commented on a change in pull request #1330: wip: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#discussion_r152436102 ## File path: java/vector/src/main/codegen/data/ValueVectorTypes.tdd ## @@ -73,26 +73,10 @@ { class: "UInt8" }, { class: "Float8", javaType: "double", boxedType: "Double", fields: [{name: "value", type: "double"}] }, { class: "DateMilli",javaType: "long", friendlyType: "LocalDateTime" }, -{ class: "TimeStampSec", javaType: "long", boxedType: "Long", friendlyType: "LocalDateTime" }, -{ class: "TimeStampMilli", javaType: "long", boxedType: "Long", friendlyType: "LocalDateTime" }, -{ class: "TimeStampMicro", javaType: "long", boxedType: "Long", friendlyType: "LocalDateTime" }, -{ class: "TimeStampNano",javaType: "long", boxedType: "Long", friendlyType: "LocalDateTime" }, -{ class: "TimeStampSecTZ", javaType: "long", boxedType: "Long", - typeParams: [ {name: "timezone", type: "String"} ], - arrowType: "org.apache.arrow.vector.types.pojo.ArrowType.Timestamp", - arrowTypeConstructorParams: ["org.apache.arrow.vector.types.TimeUnit.SECOND", "timezone"] }, -{ class: "TimeStampMilliTZ", javaType: "long", boxedType: "Long", - typeParams: [ {name: "timezone", type: "String"} ], - arrowType: "org.apache.arrow.vector.types.pojo.ArrowType.Timestamp", - arrowTypeConstructorParams: ["org.apache.arrow.vector.types.TimeUnit.MILLISECOND", "timezone"] }, -{ class: "TimeStampMicroTZ", javaType: "long", boxedType: "Long", - typeParams: [ {name: "timezone", type: "String"} ], - arrowType: "org.apache.arrow.vector.types.pojo.ArrowType.Timestamp", - arrowTypeConstructorParams: ["org.apache.arrow.vector.types.TimeUnit.MICROSECOND", "timezone"] }, -{ class: "TimeStampNanoTZ", javaType: "long", boxedType: "Long", - typeParams: [ {name: "timezone", type: "String"} ], - arrowType: "org.apache.arrow.vector.types.pojo.ArrowType.Timestamp", - arrowTypeConstructorParams: ["org.apache.arrow.vector.types.TimeUnit.NANOSECOND", "timezone"] }, +{ class: "Timestamp",javaType: "long", boxedType: "Long", friendlyType: "LocalDateTime" Review comment: It is used. The way it currently works is if timezone is specified, if will return the `LocalDateTime` in the specified time zone. For example: ``` value = 0, timezone = null ``` returns `LocalDateTime(1970-01-01 00:00:00)` ``` value = 0, timezone = "America/New_York" ``` returns `LocalDateTime(1969-12-31 19:00:00)` This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261703#comment-16261703 ] ASF GitHub Bot commented on ARROW-1816: --- icexelloss commented on a change in pull request #1330: wip: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#discussion_r152436102 ## File path: java/vector/src/main/codegen/data/ValueVectorTypes.tdd ## @@ -73,26 +73,10 @@ { class: "UInt8" }, { class: "Float8", javaType: "double", boxedType: "Double", fields: [{name: "value", type: "double"}] }, { class: "DateMilli",javaType: "long", friendlyType: "LocalDateTime" }, -{ class: "TimeStampSec", javaType: "long", boxedType: "Long", friendlyType: "LocalDateTime" }, -{ class: "TimeStampMilli", javaType: "long", boxedType: "Long", friendlyType: "LocalDateTime" }, -{ class: "TimeStampMicro", javaType: "long", boxedType: "Long", friendlyType: "LocalDateTime" }, -{ class: "TimeStampNano",javaType: "long", boxedType: "Long", friendlyType: "LocalDateTime" }, -{ class: "TimeStampSecTZ", javaType: "long", boxedType: "Long", - typeParams: [ {name: "timezone", type: "String"} ], - arrowType: "org.apache.arrow.vector.types.pojo.ArrowType.Timestamp", - arrowTypeConstructorParams: ["org.apache.arrow.vector.types.TimeUnit.SECOND", "timezone"] }, -{ class: "TimeStampMilliTZ", javaType: "long", boxedType: "Long", - typeParams: [ {name: "timezone", type: "String"} ], - arrowType: "org.apache.arrow.vector.types.pojo.ArrowType.Timestamp", - arrowTypeConstructorParams: ["org.apache.arrow.vector.types.TimeUnit.MILLISECOND", "timezone"] }, -{ class: "TimeStampMicroTZ", javaType: "long", boxedType: "Long", - typeParams: [ {name: "timezone", type: "String"} ], - arrowType: "org.apache.arrow.vector.types.pojo.ArrowType.Timestamp", - arrowTypeConstructorParams: ["org.apache.arrow.vector.types.TimeUnit.MICROSECOND", "timezone"] }, -{ class: "TimeStampNanoTZ", javaType: "long", boxedType: "Long", - typeParams: [ {name: "timezone", type: "String"} ], - arrowType: "org.apache.arrow.vector.types.pojo.ArrowType.Timestamp", - arrowTypeConstructorParams: ["org.apache.arrow.vector.types.TimeUnit.NANOSECOND", "timezone"] }, +{ class: "Timestamp",javaType: "long", boxedType: "Long", friendlyType: "LocalDateTime" Review comment: It is used. The way it currently works is if timezone is specified, if will just drop the timezone. For example: ``` value = 0, timezone = null ``` returns `LocalDateTime(1970-01-01 00:00:00)` ``` value = 0, timezone = "America/New_York" ``` returns `LocalDateTime(1969-12-31 19:00:00)` This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261696#comment-16261696 ] ASF GitHub Bot commented on ARROW-1816: --- icexelloss commented on a change in pull request #1330: wip: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#discussion_r152435367 ## File path: java/vector/pom.xml ## @@ -135,6 +135,13 @@ org.apache.drill.tools drill-fmpp-maven-plugin 1.5.0 + + +org.freemarker +freemarker +2.3.23 + + Review comment: This should just be build dependency. Before it was using 2.3.21. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261694#comment-16261694 ] ASF GitHub Bot commented on ARROW-1816: --- icexelloss commented on issue #1330: wip: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#issuecomment-346199159 @BryanCutler Thanks for the code review. I will clean this up. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261692#comment-16261692 ] ASF GitHub Bot commented on ARROW-1816: --- icexelloss commented on issue #1330: wip: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#issuecomment-346198938 @jacques-n During the refactoring, I discovered that union vector doesn't support types with type params (i.e. Decimal and Timestamp), so I fixed that as well. Now union vectors support decimal as well. The PR now includes two major changes: * Consolidate all timestamp vectors into a single vector class * Add support for non-simple minor type (i.e., decimal, timestamp) in Union Most of the template change is to support types with type params, most of them replaces: ``` <#if !minor.typeParams?? > // handle types with no type params ``` to ``` <#if !minor.typeParams?? > // handle types with no type params <#else> // handle types with type params ``` Please take a look when you have chance. Thanks! Hopefully we can clean this up and merge the week after Thanksgiving. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261680#comment-16261680 ] ASF GitHub Bot commented on ARROW-1816: --- icexelloss commented on a change in pull request #1330: wip: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#discussion_r152433253 ## File path: java/vector/src/main/java/org/apache/arrow/vector/NullableTimestampVector.java ## @@ -18,30 +18,83 @@ package org.apache.arrow.vector; +import com.google.common.base.Preconditions; +import org.apache.arrow.vector.types.TimeUnit; +import org.joda.time.DateTimeZone; + import io.netty.buffer.ArrowBuf; import org.apache.arrow.memory.BufferAllocator; +import org.apache.arrow.vector.complex.impl.TimestampReaderImpl; +import org.apache.arrow.vector.complex.reader.FieldReader; +import org.apache.arrow.vector.holders.NullableTimestampHolder; +import org.apache.arrow.vector.holders.TimestampHolder; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.ArrowType; import org.apache.arrow.vector.types.pojo.FieldType; import org.apache.arrow.vector.util.TransferPair; +import org.joda.time.LocalDateTime; Review comment: We really need that checkstyle :) This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261297#comment-16261297 ] ASF GitHub Bot commented on ARROW-1816: --- BryanCutler commented on a change in pull request #1330: wip: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#discussion_r152366389 ## File path: java/vector/src/main/java/org/apache/arrow/vector/NullableTimestampVector.java ## @@ -18,30 +18,83 @@ package org.apache.arrow.vector; +import com.google.common.base.Preconditions; +import org.apache.arrow.vector.types.TimeUnit; +import org.joda.time.DateTimeZone; + import io.netty.buffer.ArrowBuf; import org.apache.arrow.memory.BufferAllocator; +import org.apache.arrow.vector.complex.impl.TimestampReaderImpl; +import org.apache.arrow.vector.complex.reader.FieldReader; +import org.apache.arrow.vector.holders.NullableTimestampHolder; +import org.apache.arrow.vector.holders.TimestampHolder; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.ArrowType; import org.apache.arrow.vector.types.pojo.FieldType; import org.apache.arrow.vector.util.TransferPair; +import org.joda.time.LocalDateTime; Review comment: minor: maybe move this import up with DateTimeZone, and move the TimeUnit down with other arrow This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261298#comment-16261298 ] ASF GitHub Bot commented on ARROW-1816: --- BryanCutler commented on a change in pull request #1330: wip: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#discussion_r152363389 ## File path: java/vector/src/main/codegen/templates/MapWriters.java ## @@ -242,7 +242,7 @@ public void end() { <#assign constructorParams = minor.arrowTypeConstructorParams /> <#else> <#assign constructorParams = [] /> - <#list minor.typeParams?reverse as typeParam> + <#list minor.typeParams as typeParam> Review comment: Why change this? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261300#comment-16261300 ] ASF GitHub Bot commented on ARROW-1816: --- BryanCutler commented on a change in pull request #1330: wip: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#discussion_r152365593 ## File path: java/vector/src/main/codegen/templates/UnionVector.java ## @@ -138,16 +138,20 @@ private void setReaderAndWriterIndex() { throw new UnsupportedOperationException("There are no inner vectors. Use geFieldBuffers"); } - private String fieldName(MinorType type) { -return type.name().toLowerCase(); + private String fieldName(ArrowType type) { +return Types.getMinorTypeForArrowType(type).name().toLowerCase(); } - private FieldType fieldType(MinorType type) { -return FieldType.nullable(type.getType()); - } + // private FieldType fieldType(MinorType type) { + // return FieldType.nullable(type.getType()); + // } + + // private T addOrGet(MinorType minorType, Class c) { + // return internalMap.addOrGet(fieldName(minorType), fieldType(minorType), c); + // } Review comment: forgot to remove? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261296#comment-16261296 ] ASF GitHub Bot commented on ARROW-1816: --- BryanCutler commented on a change in pull request #1330: wip: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#discussion_r152362306 ## File path: java/vector/src/main/codegen/data/ValueVectorTypes.tdd ## @@ -73,26 +73,10 @@ { class: "UInt8" }, { class: "Float8", javaType: "double", boxedType: "Double", fields: [{name: "value", type: "double"}] }, { class: "DateMilli",javaType: "long", friendlyType: "LocalDateTime" }, -{ class: "TimeStampSec", javaType: "long", boxedType: "Long", friendlyType: "LocalDateTime" }, -{ class: "TimeStampMilli", javaType: "long", boxedType: "Long", friendlyType: "LocalDateTime" }, -{ class: "TimeStampMicro", javaType: "long", boxedType: "Long", friendlyType: "LocalDateTime" }, -{ class: "TimeStampNano",javaType: "long", boxedType: "Long", friendlyType: "LocalDateTime" }, -{ class: "TimeStampSecTZ", javaType: "long", boxedType: "Long", - typeParams: [ {name: "timezone", type: "String"} ], - arrowType: "org.apache.arrow.vector.types.pojo.ArrowType.Timestamp", - arrowTypeConstructorParams: ["org.apache.arrow.vector.types.TimeUnit.SECOND", "timezone"] }, -{ class: "TimeStampMilliTZ", javaType: "long", boxedType: "Long", - typeParams: [ {name: "timezone", type: "String"} ], - arrowType: "org.apache.arrow.vector.types.pojo.ArrowType.Timestamp", - arrowTypeConstructorParams: ["org.apache.arrow.vector.types.TimeUnit.MILLISECOND", "timezone"] }, -{ class: "TimeStampMicroTZ", javaType: "long", boxedType: "Long", - typeParams: [ {name: "timezone", type: "String"} ], - arrowType: "org.apache.arrow.vector.types.pojo.ArrowType.Timestamp", - arrowTypeConstructorParams: ["org.apache.arrow.vector.types.TimeUnit.MICROSECOND", "timezone"] }, -{ class: "TimeStampNanoTZ", javaType: "long", boxedType: "Long", - typeParams: [ {name: "timezone", type: "String"} ], - arrowType: "org.apache.arrow.vector.types.pojo.ArrowType.Timestamp", - arrowTypeConstructorParams: ["org.apache.arrow.vector.types.TimeUnit.NANOSECOND", "timezone"] }, +{ class: "Timestamp",javaType: "long", boxedType: "Long", friendlyType: "LocalDateTime" Review comment: Is the `friendlyType` param still used any more? If so then will `LocalDateTime` work with a timezone set? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261295#comment-16261295 ] ASF GitHub Bot commented on ARROW-1816: --- BryanCutler commented on a change in pull request #1330: wip: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#discussion_r152361482 ## File path: java/vector/src/main/codegen/data/ValueVectorTypes.tdd ## @@ -116,7 +100,7 @@ { class: "Decimal", maxPrecisionDigits: 38, nDecimalDigits: 4, friendlyType: "BigDecimal", - typeParams: [ {name: "scale", type: "int"}, { name: "precision", type: "int"}], + typeParams: [ { name: "precision", type: "int"}, {name: "scale", type: "int"} ], Review comment: These were previously flipped in ARROW-1091 and then flipped again in ARROW-1092. I thought they were right already, are they not? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261302#comment-16261302 ] ASF GitHub Bot commented on ARROW-1816: --- BryanCutler commented on a change in pull request #1330: wip: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#discussion_r152362623 ## File path: java/vector/pom.xml ## @@ -135,6 +135,13 @@ org.apache.drill.tools drill-fmpp-maven-plugin 1.5.0 + + +org.freemarker +freemarker +2.3.23 + + Review comment: How come this needs to be added now? If it does can it be scoped to compile phase? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261299#comment-16261299 ] ASF GitHub Bot commented on ARROW-1816: --- BryanCutler commented on a change in pull request #1330: wip: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#discussion_r152370888 ## File path: java/vector/src/main/java/org/apache/arrow/vector/types/TimeUnit.java ## @@ -19,10 +19,10 @@ package org.apache.arrow.vector.types; public enum TimeUnit { - SECOND(org.apache.arrow.flatbuf.TimeUnit.SECOND), - MILLISECOND(org.apache.arrow.flatbuf.TimeUnit.MILLISECOND), - MICROSECOND(org.apache.arrow.flatbuf.TimeUnit.MICROSECOND), - NANOSECOND(org.apache.arrow.flatbuf.TimeUnit.NANOSECOND); + SECOND(org.apache.arrow.flatbuf.TimeUnit.SECOND, java.util.concurrent.TimeUnit.SECONDS), + MILLISECOND(org.apache.arrow.flatbuf.TimeUnit.MILLISECOND, java.util.concurrent.TimeUnit.MILLISECONDS), + MICROSECOND(org.apache.arrow.flatbuf.TimeUnit.MICROSECOND, java.util.concurrent.TimeUnit.MICROSECONDS), + NANOSECOND(org.apache.arrow.flatbuf.TimeUnit.NANOSECOND, java.util.concurrent.TimeUnit.NANOSECONDS); Review comment: Would it be better to leave this as it was and just put a switch statement in `getTimeUnit` to return the corresponding java TimeUnit? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261301#comment-16261301 ] ASF GitHub Bot commented on ARROW-1816: --- BryanCutler commented on a change in pull request #1330: wip: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#discussion_r152369556 ## File path: java/vector/src/main/java/org/apache/arrow/vector/NullableTimestampVector.java ## @@ -124,6 +177,32 @@ public void setSafe(int index, long value) { set(index, value); } + public void setSafe(int index, NullableTimestampHolder holder) { Review comment: Do these set methods need to check that the type params are compatible? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259467#comment-16259467 ] ASF GitHub Bot commented on ARROW-1816: --- icexelloss commented on issue #1330: wip: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#issuecomment-345757388 @jacques-n I am fixing that part. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259428#comment-16259428 ] ASF GitHub Bot commented on ARROW-1816: --- jacques-n commented on issue #1330: wip: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#issuecomment-345747624 On first look this makes sense. However, if I'm reading this right, the reader/writer implementations have lost the ability to set the right unit. How does one do this? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259351#comment-16259351 ] ASF GitHub Bot commented on ARROW-1816: --- icexelloss commented on issue #1330: wip: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#issuecomment-345726910 @wesm @jacques-n @siddharthteotia Timing wise I should be able to make this into 0.8 if we are happy about this approach. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16258237#comment-16258237 ] ASF GitHub Bot commented on ARROW-1816: --- icexelloss commented on issue #1330: wip: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330#issuecomment-345474775 This PR is a RFC. I think the resulting timestamp vector (NullableTimestampVector) is still branch-free at the cell level. cc @jacques-n can you take a look and let me know what you think? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16258235#comment-16258235 ] ASF GitHub Bot commented on ARROW-1816: --- icexelloss opened a new pull request #1330: wip: ARROW-1816: [Java] Resolve new vector classes structure for timestamp, date and maybe interval URL: https://github.com/apache/arrow/pull/1330 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Labels: pull-request-available > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16257279#comment-16257279 ] Li Jin commented on ARROW-1816: --- I will look at cell level a bit closer, my hunch is branching can be avoided by storing a timeUnit object in the vector: https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/NullableTimeStampNanoVector.java#L116 timeZone doesn't seem to be used in cell level either, but I need to look closer. > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ARROW-1816) [Java] Resolve new vector classes structure for timestamp, date and maybe interval
[ https://issues.apache.org/jira/browse/ARROW-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16257181#comment-16257181 ] Wes McKinney commented on ARROW-1816: - I think the issue here is to avoid branching at the array cell for Java. If there is a switch branch at the hot path for accessing values, then the JIT may generate worse code > [Java] Resolve new vector classes structure for timestamp, date and maybe > interval > -- > > Key: ARROW-1816 > URL: https://issues.apache.org/jira/browse/ARROW-1816 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Li Jin > Fix For: 0.8.0 > > > Personally I think having 8 vector classes for timestamps is not great. This > is discussed at some point during the PR: > https://github.com/apache/arrow/pull/1203#discussion_r145241388 -- This message was sent by Atlassian JIRA (v6.4.14#64029)