[jira] [Commented] (ARROW-542) [Java] Implement dictionaries in stream/file encoding

2017-02-08 Thread Emilio Lahr-Vivaz (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15858342#comment-15858342 ] Emilio Lahr-Vivaz commented on ARROW-542: - Ah, makes sense thanks. > [Java] Implement dictionar

[jira] [Commented] (ARROW-542) [Java] Implement dictionaries in stream/file encoding

2017-02-08 Thread Emilio Lahr-Vivaz (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15858295#comment-15858295 ] Emilio Lahr-Vivaz commented on ARROW-542: - [~wesmckinn] I'm looking into how dictionary vectors

[jira] [Created] (ARROW-542) [Java] Implement dictionaries in stream/file encoding

2017-02-08 Thread Emilio Lahr-Vivaz (JIRA)
Emilio Lahr-Vivaz created ARROW-542: --- Summary: [Java] Implement dictionaries in stream/file encoding Key: ARROW-542 URL: https://issues.apache.org/jira/browse/ARROW-542 Project: Apache Arrow

Java dictionary encoding

2017-01-24 Thread Emilio Lahr-Vivaz
Hello, I'm interested in Java dictionary encoding (https://issues.apache.org/jira/browse/ARROW-366). Can I pick that up and start working on it? Anything I need to do first? Thanks, Emilio Lahr-Vivaz

Re: Java dictionary encoding

2017-01-25 Thread Emilio Lahr-Vivaz
what I did with the initial C++ implementation in https://github.com/apache/arrow/commit/74685f386307171a90a9f97316e25b7f39cdd0a1#diff-708b00b9a46568e0fac8dcc1ac5f2749 If you need help feel free to ping us on the mailing list or JIRA. best Wes On Tue, Jan 24, 2017 at 12:39 PM, Emilio Lahr

[jira] [Commented] (ARROW-542) [Java] Implement dictionaries in stream/file encoding

2017-02-09 Thread Emilio Lahr-Vivaz (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15860292#comment-15860292 ] Emilio Lahr-Vivaz commented on ARROW-542: - Another blocker I'm hitting is that I don't see any way

[jira] [Commented] (ARROW-542) [Java] Implement dictionaries in stream/file encoding

2017-02-09 Thread Emilio Lahr-Vivaz (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15859863#comment-15859863 ] Emilio Lahr-Vivaz commented on ARROW-542: - It's getting a little complicated trying to encode

[jira] [Created] (ARROW-691) [Java] Encode dictionary Int type in message format

2017-03-22 Thread Emilio Lahr-Vivaz (JIRA)
Emilio Lahr-Vivaz created ARROW-691: --- Summary: [Java] Encode dictionary Int type in message format Key: ARROW-691 URL: https://issues.apache.org/jira/browse/ARROW-691 Project: Apache Arrow

buffer alignment (format/java/js)

2017-08-08 Thread Emilio Lahr-Vivaz
I'm looking into buffer alignment in the java writer classes. Currently some files written with the java streaming writer can't be read due to the javascript TypedArray's restriction that the start offset of the array must be a multiple of the data size of the array type (i.e. Int32Vectors
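
The TypedArray restriction mentioned above can be illustrated with a small check. This is only a sketch: the class and method names (`TypedArrayOffsets`, `isAligned`) are hypothetical and are not part of the Arrow Java or JavaScript APIs. It just tests whether a buffer's start offset is a multiple of the element size, which is the condition a 4-byte integer view would need.

```java
// Hypothetical helper for illustration only; not part of any Arrow API.
final class TypedArrayOffsets {
    private TypedArrayOffsets() {}

    /**
     * Returns true if a typed view whose elements are elementSize bytes wide
     * could start at byteOffset, i.e. the offset is a multiple of the element size.
     */
    static boolean isAligned(long byteOffset, int elementSize) {
        if (byteOffset < 0 || elementSize <= 0) {
            throw new IllegalArgumentException("offset must be >= 0 and element size > 0");
        }
        return byteOffset % elementSize == 0;
    }

    public static void main(String[] args) {
        // A 4-byte integer buffer starting at offset 8 satisfies the restriction.
        System.out.println(isAligned(8, 4));
        // The same buffer starting at offset 6 does not.
        System.out.println(isAligned(6, 4));
    }
}
```

A reader-side check like this only detects the problem; the writer still has to pad buffers so that the offsets come out aligned.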

Re: buffer alignment (format/java/js)

2017-08-08 Thread Emilio Lahr-Vivaz
After looking at it further, I think only the buffers themselves need to be aligned, not the metadata and/or schema. Would there be any problem with changing the alignment to 64 bytes then? Thanks, Emilio On 08/08/2017 08:08 AM, Emilio Lahr-Vivaz wrote: I'm looking into buffer alignment

Re: buffer alignment (format/java/js)

2017-08-08 Thread Emilio Lahr-Vivaz
tps://github.com/apache/arrow/blob/master/format/IPC.md that message sizes are expected to be a multiple of 8. We should also take a look at the File format implementation to ensure that padding is inserted after the magic number at the start of the file - Wes On Tue, Aug 8, 2017 at 1:32 PM, E
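
Padding a length up to a multiple of 8 (or 64) can be sketched as follows. The `Padding` class and its methods are hypothetical names introduced for illustration, not the Arrow writer's actual code, and the example assumes the file magic is the 6-byte ASCII string `ARROW1`.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not the Arrow writer implementation.
final class Padding {
    private Padding() {}

    /** Rounds length up to the next multiple of alignment, which must be a power of two. */
    static long paddedLength(long length, int alignment) {
        if (length < 0 || alignment <= 0 || (alignment & (alignment - 1)) != 0) {
            throw new IllegalArgumentException(
                "length must be >= 0 and alignment a positive power of two");
        }
        long mask = (long) alignment - 1;
        return (length + mask) & ~mask;
    }

    /** Number of zero bytes to append so that length becomes a multiple of alignment. */
    static long paddingFor(long length, int alignment) {
        return paddedLength(length, alignment) - length;
    }

    public static void main(String[] args) {
        // Assumption: the magic string is the 6 ASCII bytes "ARROW1".
        byte[] magic = "ARROW1".getBytes(StandardCharsets.US_ASCII);
        // Padding needed after the magic to reach an 8-byte boundary.
        System.out.println(paddingFor(magic.length, 8));
        // A 100-byte buffer rounded up to a 64-byte boundary.
        System.out.println(paddedLength(100, 64));
    }
}
```

The bit-mask form only works for power-of-two alignments, which covers both the 8-byte and 64-byte cases discussed in this thread.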

Re: buffer alignment (format/java/js)

2017-08-08 Thread Emilio Lahr-Vivaz
, Emilio On 08/08/2017 09:18 AM, Emilio Lahr-Vivaz wrote: Hi Wes, You're right, I just realized that. I think the alignment issue might be in some unrelated code, actually. From what I can tell the arrow writers are aligning buffers correctly; if not I'll open a bug. Thanks, Emilio On 08/08

Re: buffer alignment (format/java/js)

2017-08-08 Thread Emilio Lahr-Vivaz
clarify? - Wes On Tue, Aug 8, 2017 at 8:52 AM, Emilio Lahr-Vivaz <elahrvi...@ccri.com> wrote: After looking at it further, I think only the buffers themselves need to be aligned, not the metadata and/or schema. Would there be any problem with changing the alignment to 64 bytes then?

[jira] [Created] (ARROW-1340) [Java] NullableMapVector field doesn't maintain metadata

2017-08-08 Thread Emilio Lahr-Vivaz (JIRA)
Emilio Lahr-Vivaz created ARROW-1340: Summary: [Java] NullableMapVector field doesn't maintain metadata Key: ARROW-1340 URL: https://issues.apache.org/jira/browse/ARROW-1340 Project: Apache Arrow

Re: buffer alignment (format/java/js)

2017-08-08 Thread Emilio Lahr-Vivaz
in a distributed fashion, and then > concatenating them in the streaming format. Can you show the code for this? On Tue, Aug 8, 2017 at 12:35 PM, Emilio Lahr-Vivaz <elahrvi...@ccri.com> wrote: So I think the issue is that we are serializing record batches in a distribute

[jira] [Created] (ARROW-1015) [Java] Implement schema-level metadata

2017-05-12 Thread Emilio Lahr-Vivaz (JIRA)
Emilio Lahr-Vivaz created ARROW-1015: Summary: [Java] Implement schema-level metadata Key: ARROW-1015 URL: https://issues.apache.org/jira/browse/ARROW-1015 Project: Apache Arrow Issue

[jira] [Created] (ARROW-997) [Java] Implement transfer in FixedSizeListVector

2017-05-10 Thread Emilio Lahr-Vivaz (JIRA)
Emilio Lahr-Vivaz created ARROW-997: --- Summary: [Java] Implement transfer in FixedSizeListVector Key: ARROW-997 URL: https://issues.apache.org/jira/browse/ARROW-997 Project: Apache Arrow

Re: [ANNOUNCE] Apache Arrow 0.4.0 released

2017-05-25 Thread Emilio Lahr-Vivaz
Congrats on the release! Is there a time frame for java artifacts being available on maven central? Thanks, Emilio On 05/23/2017 01:06 PM, Wes McKinney wrote: The Apache Arrow community is pleased to announce the 0.4.0 release. It includes 77 resolved issues ([1]) since the 0.3.0 release.

Re: arrow read/write examples in Java

2017-12-19 Thread Emilio Lahr-Vivaz
This has probably changed with the Java code refactor, but I've posted some answers inline, to the best of my understanding. Thanks, Emilio On 12/16/2017 12:17 PM, Animesh Trivedi wrote: Thanks Wes for your help. Based upon some code reading, I managed to code up a basic working example. The

Re: [Java] org.apache.arrow.vector.ipc.ArrowWriter.recordBlocks

2018-04-30 Thread Emilio Lahr-Vivaz
From my time working on the arrow writers, I think that would be fine. You could do the same thing with the dictionary blocks, as well. As an implementation idea, it might be cleaner to add some callback hooks, i.e. onRecordBlockWritten(), and then implement that in the FileWriter instead of
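
The callback-hook idea could look roughly like the sketch below. Only the hook name `onRecordBlockWritten()` comes from the message; everything else (`SketchWriter`, `SketchFileWriter`, `HookDemo`, the `offset`/`length` parameters) is hypothetical and does not reflect the real `ArrowWriter` class hierarchy or signatures.

```java
import java.util.ArrayList;
import java.util.List;

// Small demo entry point; all classes in this sketch are hypothetical.
final class HookDemo {
    public static void main(String[] args) {
        SketchFileWriter writer = new SketchFileWriter();
        writer.writeRecordBlock(16);
        writer.writeRecordBlock(24);
        // Offsets collected through the hook, e.g. for a file footer.
        System.out.println(writer.recordBlockOffsets);
    }
}

// Hypothetical stand-in for a base writer; not the real ArrowWriter.
abstract class SketchWriter {
    private long position = 0;

    /** Hook invoked after each record block is written; a no-op by default. */
    protected void onRecordBlockWritten(long offset, long length) {}

    /** Pretends to write a block of the given length, then fires the hook. */
    final void writeRecordBlock(long length) {
        long offset = position;
        position += length;
        onRecordBlockWritten(offset, length);
    }
}

// Hypothetical file-style writer that remembers where each block started.
final class SketchFileWriter extends SketchWriter {
    final List<Long> recordBlockOffsets = new ArrayList<>();

    @Override
    protected void onRecordBlockWritten(long offset, long length) {
        recordBlockOffsets.add(offset);
    }
}
```

With this shape, the base writer does not need to know about block bookkeeping at all; a stream-style writer would simply leave the hook as a no-op.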

Re: Correct way to set NULL values in VarCharVector (Java API)?

2018-04-11 Thread Emilio Lahr-Vivaz
Hi Atul, You should be able to use the overloaded 'set' method that takes a NullableVarCharHolder: https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/VarCharVector.java#L237 Thanks, Emilio On 04/10/2018 05:23 PM, Atul Dambalkar wrote: Hi, I

Re: [Discuss] If and how we should integrate geospatial data (specs) in Arrow

2021-06-25 Thread Emilio Lahr-Vivaz
Hello, Re: other projects, I'd like to point out the approach that we've taken on GeoMesa, a geospatial project that I work on. We model geometries in Arrow similarly to the GeoJSON spec[1], as lists of pairs of coordinates. We used FixedSizeList vectors of size 2 to represent each

Re: [Java] Problem with maven build in docker

2021-02-26 Thread Emilio Lahr-Vivaz
Hello, I think the issue is that current master is version 4.0.0-SNAPSHOT now, but your PR is 3.0.0-SNAPSHOT: https://github.com/apache/arrow/blob/master/java/format/pom.xml#L18 Thanks, Emilio On 2/26/21 4:58 AM, Fan Liya wrote: Dear all, In a recent PR [1], I have created a new