[GitHub] incubator-hawq issue #1353: HAWQ-1605. Support INSERT in PXF JDBC plugin
Github user denalex commented on the issue: https://github.com/apache/incubator-hawq/pull/1353 @sansanichfb -- can you please approve the PR if it looks good to you? We are planning to commit it soon. ---
Github user leskin-in commented on the issue: https://github.com/apache/incubator-hawq/pull/1353 Hi @sansanichfb, You have mentioned some questions that were not replied to. I am a bit confused; could you clarify what exactly is unclear or should be corrected? I have also left notes on each of the comments you've made in the code. ---
Github user sansanichfb commented on the issue: https://github.com/apache/incubator-hawq/pull/1353 Hi @kapustor, Sorry for the late reply. I don't have any new comments; there are a couple of old comments that weren't replied to. Apart from that, it looks good. ---
Github user kapustor commented on the issue: https://github.com/apache/incubator-hawq/pull/1353 Hi @sansanichfb, Any news on this PR? ---
Github user kapustor commented on the issue: https://github.com/apache/incubator-hawq/pull/1353 Hi @sansanichfb, We've removed the note about consistency and transactions from the error text. Thank you! ---
Github user sansanichfb commented on the issue: https://github.com/apache/incubator-hawq/pull/1353 @kapustor we can't fail safely when something goes wrong with even one fragment without taking on the complexity of distributed transactions, so transaction handling doesn't look necessary in this PR. I would recommend clearly stating in the documentation that the JDBC plugin doesn't guarantee a consistent outcome. Furthermore, we should set clear expectations for users and ask them to use staging tables when inserting data from Greenplum. That would allow us to simplify the code and get rid of the cumbersome rollback logic. WRT: > As for sending all the data in one batch - as I understand, such approach will limit the max data size by memory available for PXF service, because PXF have to aggregate the batch in its memory. - this is not exactly accurate, since PXF is designed to be a lightweight layer between external storage and the consumer, so no aggregation in memory is needed (apart from some reasonable buffering, depending on I/O and network sockets). ---
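The staging-table pattern suggested above can be sketched as follows. This is a minimal illustration in Python with sqlite3 (not PXF code; the `target`/`staging` table names and schema are invented for the example): writers append rows to a staging table first, and a single final transaction moves everything into the target, so a failure during the load leaves the target table untouched.

```python
import sqlite3

def load_via_staging(conn, rows):
    """Load rows into a staging table, then move them into the target
    table in one transaction, so a mid-load failure never leaves the
    target partially written."""
    conn.isolation_level = None  # autocommit; manage transactions explicitly
    cur = conn.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS target (id INTEGER, val TEXT)")
    cur.execute("DROP TABLE IF EXISTS staging")
    cur.execute("CREATE TABLE staging (id INTEGER, val TEXT)")
    # Phase 1: append to staging; an error here leaves target intact.
    cur.executemany("INSERT INTO staging VALUES (?, ?)", rows)
    # Phase 2: one atomic move from staging into target.
    cur.execute("BEGIN")
    try:
        cur.execute("INSERT INTO target SELECT * FROM staging")
        cur.execute("DROP TABLE staging")
        cur.execute("COMMIT")
    except Exception:
        cur.execute("ROLLBACK")
        raise
```

The key property is that only phase 2 touches the target table, and it is a single local transaction, so no distributed-transaction machinery is needed.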
Github user kapustor commented on the issue: https://github.com/apache/incubator-hawq/pull/1353 Hi @sansanichfb! > Also, I am wondering, which target database patch was tested against? The transactional approach doesn't seem clear to me - if DB doesn't support transactions - you are still sending data in batches, which might leave target db in the non-consistent state since you can't rollback the last transaction. Should we reconsider this approach and send only one batch(not ideal though, maybe ask users to tweak partitioning) in that case? We have tested the patch against Oracle, Postgres, MSSQL and Ignite cluster targets; they all work fine, and the performance is good enough (especially for Ignite: about 200 MB/s for two-node Greenplum and Ignite clusters). If you know of any other target that should be tested, please let me know and I will do it. As for the transactional approach: even if the target does support transactions, the data may still end up inconsistent, for example if one of the GP segments fails to insert its batch (the other segments may succeed, so only part of the data will be inserted). As for sending all the data in one batch: as I understand it, such an approach would limit the maximum data size to the memory available to the PXF service, because PXF would have to aggregate the whole batch in its memory. ---
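The batching trade-off discussed above can be illustrated with a small sketch (Python/sqlite3 for brevity, not PXF code; the `target` table and the batch size are invented). The writer buffers at most one batch in memory and commits each batch as it is flushed, which bounds memory use by the batch size rather than the dataset size; the flip side is that a mid-stream failure leaves earlier batches already committed, i.e. exactly the partial-insert risk described in the thread.

```python
import sqlite3

BATCH_SIZE = 1000  # rows buffered before each flush; tune to the target DB

def insert_in_batches(conn, rows, batch_size=BATCH_SIZE):
    """Stream rows into the target table, flushing every batch_size rows
    so the writer only ever holds one batch in memory."""
    cur = conn.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS target (id INTEGER, val TEXT)")
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) >= batch_size:
            cur.executemany("INSERT INTO target VALUES (?, ?)", batch)
            conn.commit()  # each batch commits on its own; a later failure
            batch.clear()  # cannot roll back batches already committed
    if batch:  # flush the trailing partial batch
        cur.executemany("INSERT INTO target VALUES (?, ?)", batch)
        conn.commit()
```

Because each commit is independent, crash recovery has to tolerate (or clean up) already-committed batches, which is why the staging-table approach was suggested earlier in the thread.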
Github user sansanichfb commented on the issue: https://github.com/apache/incubator-hawq/pull/1353 Good job @leskin-in! I left some comments. Also, I am wondering which target databases the patch was tested against. The transactional approach doesn't seem clear to me: if the DB doesn't support transactions, you are still sending data in batches, which might leave the target DB in an inconsistent state, since you can't roll back the last transaction. Should we reconsider this approach and send only one batch (not ideal, though; maybe ask users to tweak partitioning) in that case? ---
Github user kongyew commented on the issue: https://github.com/apache/incubator-hawq/pull/1353 **This JDBC profile is not officially supported** by Greenplum, since it is not tested at all. Use it at your own risk. The interface may change in the future, since the current profile does not store credentials securely. ---
Github user deem0n commented on the issue: https://github.com/apache/incubator-hawq/pull/1353 Yep, our customers have plenty of different data sources, mostly JDBC-compliant. This feature makes GPDB a very flexible and friendly player in the corporate IT landscape. ---
Github user mebelousov commented on the issue: https://github.com/apache/incubator-hawq/pull/1353 This is a great feature. We really need to upload data from Greenplum to other databases. ---