[GitHub] [arrow] zhztheplayer commented on pull request #7030: ARROW-7808: [Java][Dataset] Implement Datasets Java API by JNI to C++

2020-10-11 Thread GitBox
zhztheplayer commented on pull request #7030: URL: https://github.com/apache/arrow/pull/7030#issuecomment-706836000 > > The logic might warrant its own interface instead of a std::Function. > > I believe we already have this functionality in Gandiva. Let's see if we can use a

[GitHub] [arrow] zhztheplayer commented on pull request #7030: ARROW-7808: [Java][Dataset] Implement Datasets Java API by JNI to C++

2020-09-28 Thread GitBox
zhztheplayer commented on pull request #7030: URL: https://github.com/apache/arrow/pull/7030#issuecomment-700390598 Hi @Sebastiaan-Alvarez-Rodriguez , First of all, thanks for the detailed explanation about the issue and your suggestions. Sorry for the late reply. Actually I

[GitHub] [arrow] zhztheplayer commented on pull request #7030: ARROW-7808: [Java][Dataset] Implement Datasets Java API by JNI to C++

2020-08-23 Thread GitBox
zhztheplayer commented on pull request #7030: URL: https://github.com/apache/arrow/pull/7030#issuecomment-678839259 @emkornfield Sorry for the late reply. And yes I was planing to try adding a `Bits.java` based implementation to this PR, but I may not be able to working on it instantly

[GitHub] [arrow] zhztheplayer commented on pull request #7030: ARROW-7808: [Java][Dataset] Implement Datasets Java API by JNI to C++

2020-07-20 Thread GitBox
zhztheplayer commented on pull request #7030: URL: https://github.com/apache/arrow/pull/7030#issuecomment-660927861 Sorry I am reconsidering whether to use JVM direct memory as a limit of native buffer allocation. By Oracle's explanation[1] about direct memory, direct memory seems only to

[GitHub] [arrow] zhztheplayer commented on pull request #7030: ARROW-7808: [Java][Dataset] Implement Datasets Java API by JNI to C++

2020-07-15 Thread GitBox
zhztheplayer commented on pull request #7030: URL: https://github.com/apache/arrow/pull/7030#issuecomment-658605439 > I was suggesting a third approach: C++ memory pool that simply updates the direct memory info via Bits in java. It should be done chunk-wise to avoid excessive JNI

[GitHub] [arrow] zhztheplayer commented on pull request #7030: ARROW-7808: [Java][Dataset] Implement Datasets Java API by JNI to C++

2020-07-15 Thread GitBox
zhztheplayer commented on pull request #7030: URL: https://github.com/apache/arrow/pull/7030#issuecomment-658593410 @jacques-n Addressing a previous comment: > We need to avoid this entirely. If you need some functionality, let's figure out what should be exposed.

[GitHub] [arrow] zhztheplayer commented on pull request #7030: ARROW-7808: [Java][Dataset] Implement Datasets Java API by JNI to C++

2020-07-05 Thread GitBox
zhztheplayer commented on pull request #7030: URL: https://github.com/apache/arrow/pull/7030#issuecomment-654004147 @jacques-n I think we have several choices: we can try implement a c++ memory pool which is backed by a Java Allocator, actually I have a PoC branch for that:

[GitHub] [arrow] zhztheplayer commented on pull request #7030: ARROW-7808: [Java][Dataset] Implement Datasets Java API by JNI to C++

2020-07-05 Thread GitBox
zhztheplayer commented on pull request #7030: URL: https://github.com/apache/arrow/pull/7030#issuecomment-653997609 @rymurr You are right, schemas are not frequently transferred over JNI so it's more straightforward to use byte arrays in development. I think your suggestion makes sense

[GitHub] [arrow] zhztheplayer commented on pull request #7030: ARROW-7808: [Java][Dataset] Implement Datasets Java API by JNI to C++

2020-06-29 Thread GitBox
zhztheplayer commented on pull request #7030: URL: https://github.com/apache/arrow/pull/7030#issuecomment-651422684 Thanks for the comments! I've got some stuffs to deal with these days. Will address as soon as possible.

[GitHub] [arrow] zhztheplayer commented on pull request #7030: ARROW-7808: [Java][Dataset] Implement Datasets Java API by JNI to C++

2020-06-15 Thread GitBox
zhztheplayer commented on pull request #7030: URL: https://github.com/apache/arrow/pull/7030#issuecomment-644494295 @fsaintjacques Rebased now. As we have a unresolved [discussion](https://github.com/apache/arrow/pull/7030#discussion_r440075959) about memory management of native

[GitHub] [arrow] zhztheplayer commented on pull request #7030: ARROW-7808: [Java][Dataset] Implement Datasets Java API by JNI to C++

2020-06-10 Thread GitBox
zhztheplayer commented on pull request #7030: URL: https://github.com/apache/arrow/pull/7030#issuecomment-641936183 Hi @fsaintjacques, @emkornfield it would be much appreciated if you would like to continue reviewing this, here is a list for modifications since latest review comment:

[GitHub] [arrow] zhztheplayer commented on pull request #7030: ARROW-7808: [Java][Dataset] Implement Datasets Java API by JNI to C++

2020-05-19 Thread GitBox
zhztheplayer commented on pull request #7030: URL: https://github.com/apache/arrow/pull/7030#issuecomment-630624723 Hi @emkornfield @wesm, comments addressed and filter implementation is removed from this PR (I suppose we expect a separate topic about it). Any further changes needed

[GitHub] [arrow] zhztheplayer commented on pull request #7030: ARROW-7808: [Java][Dataset] Implement Datasets Java API by JNI to C++

2020-04-28 Thread GitBox
zhztheplayer commented on pull request #7030: URL: https://github.com/apache/arrow/pull/7030#issuecomment-620528808 Hi @wesm @emkornfield , based on the discussion in the [latest commit](https://github.com/apache/arrow/pull/7030/commits/7b16a41d0f9f3265f00c92ce50304b68ee03315f) I removed