[GitHub] cloud-fan commented on issue #23208: [SPARK-25530][SQL] data source v2 API refactor (batch write)
cloud-fan commented on issue #23208: [SPARK-25530][SQL] data source v2 API refactor (batch write) URL: https://github.com/apache/spark/pull/23208#issuecomment-454354966 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] cloud-fan commented on issue #23208: [SPARK-25530][SQL] data source v2 API refactor (batch write)
cloud-fan commented on issue #23208: [SPARK-25530][SQL] data source v2 API refactor (batch write) URL: https://github.com/apache/spark/pull/23208#issuecomment-454282941 Hi @rdblue , thanks for the review! It will be great to finish all the write operations soon, and adding overwrite is good as the next step!
[GitHub] cloud-fan commented on issue #23208: [SPARK-25530][SQL] data source v2 API refactor (batch write)
cloud-fan commented on issue #23208: [SPARK-25530][SQL] data source v2 API refactor (batch write) URL: https://github.com/apache/spark/pull/23208#issuecomment-450009721

Hi @rdblue, thanks for looking at it!

> deviations between the read and write side structures should be thought through and justified in the doc's text

This has been added to the doc. Please take another look when you have time.

> Same with differences between the batch write and streaming write

Isn't that the queryId discussion? queryId has been there for years, and I think it's reasonable to design on the assumption that it stays. If there is a proposal to remove queryId, that work should proceed in parallel, and we can adjust the v2 API accordingly once we decide to remove it.

Technically, we should not block a PR just because specific people haven't reviewed it. I'd appreciate it if you could give some suggestions on the doc/PR, and I'll update them ASAP. But if you are busy, shall we let other people review/merge and move forward? We can always have follow-up PRs to address new comments, and even revert this one if we find serious problems later on.
[GitHub] cloud-fan commented on issue #23208: [SPARK-25530][SQL] data source v2 API refactor (batch write)
cloud-fan commented on issue #23208: [SPARK-25530][SQL] data source v2 API refactor (batch write) URL: https://github.com/apache/spark/pull/23208#issuecomment-448837287

> but it affects the API because it may change the structure of the builder or may be a reason to use a different pattern.

According to the discussion in the doc, it seems you are challenging the current streaming engine in Spark. I don't think we should block the API refactor because of that. If we do need to change something fundamental in the streaming engine, that's a lot of work, and it doesn't matter whether the API refactor is done before or after it.

> can you put the pseudocode for Spark's invocation sequence

Sure, I'll follow the read side doc and put it in the write side doc.