emaynardigs commented on issue #27155: [SPARK-17636][SPARK-25557][SQL] Parquet and ORC predicate pushdown in nested fields
URL: https://github.com/apache/spark/pull/27155#issuecomment-573350984

> First of all, you must add @dbtsai 's authorship by add a commit with his authorship.
>
> The following is not a standard way to keep the authorship.
>
> > Firstly, much of this PR is a rebase of #22535, much thanks to @dbtsai for his work.
>
> Second, you need to address all the existing comment in the original PR. In the PR description, could you explain what is the improvement here from the original PR? If there is nothing new here, we had better close this one and asking @dbtsai to update his original PR.

Hi @dongjoon-hyun, thanks for the feedback! When I looked at the original PR, I was unaware that @dbtsai is a member and, in fact, your colleague. Your concern definitely makes sense.

First, there are no commits here under the original author's ownership. The code has diverged to the point where, although I took some ideas and code from the original PR, it was easier to redo all of it manually. I called this a "rebase", but only in the abstract sense that a lot of code was copied and updated for the latest master, not in any source-control sense.

Second, I'll try to elaborate on why I opened a new PR:

1. The original PR was abandoned, with no activity or reply from the author in a year. The original author _was_ asked to update his PR and did not respond.
2. The original PR was flawed, failed several tests, and included some strange choices (like always splitting on a field that contained '.') that I did not agree with.
3. The original PR did not extend to ORC and only worked for Parquet. My company uses yet another binary format that the original PR would have been incompatible with, but the new one can work with.

As you pointed out, I have addressed the comments in the original PR. I've extended the functionality to ORC as well as Parquet, tested it myself, and written more unit tests (largely copied from the original PR); I am currently writing more, pending approval of the basic code here. I would not say there is nothing new here.
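
To make the concern in point 2 concrete, here is a minimal sketch (not code from either PR) of why always splitting a column name on '.' is risky. The helper names `split_naive` and `split_quoted` are hypothetical, and treating backticks as the quoting convention for names that contain a literal dot is an assumption made purely for illustration.

```python
def split_naive(name: str) -> list[str]:
    """Always split on '.', treating every dot as a nesting separator."""
    return name.split(".")


def split_quoted(name: str) -> list[str]:
    """Split on '.' only outside backtick-quoted segments.

    Assumption for this sketch: a backtick-quoted segment is a single
    field name, even if it contains dots.
    """
    parts: list[str] = []
    buf: list[str] = []
    quoted = False
    for ch in name:
        if ch == "`":
            quoted = not quoted  # toggle quoting; the backtick itself is dropped
        elif ch == "." and not quoted:
            parts.append("".join(buf))  # an unquoted dot ends the current segment
            buf = []
        else:
            buf.append(ch)
    parts.append("".join(buf))
    return parts


# A top-level column literally named "a.b" with a nested field "c":
print(split_naive("`a.b`.c"))   # the dotted name is wrongly split in two
print(split_quoted("`a.b`.c"))  # the dotted name stays in one piece
```

With the naive split, a filter on a column whose name contains a dot would be treated as a filter on a nested field that may not exist, which is the kind of behaviour I did not want to carry over.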
