[GitHub] [drill] arina-ielchiieva commented on issue #1836: DRILL-7156: empty parquet files support
arina-ielchiieva commented on issue #1836: DRILL-7156: empty parquet files support
URL: https://github.com/apache/drill/pull/1836#issuecomment-520813625
Good questions. You can investigate how union types are currently handled. Regarding who wins, you could look at `org.apache.drill.exec.physical.impl.union.UnionAllRecordBatch` to see how it creates the combined schema using precedence rules.
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
With regards, Apache Git Services
[GitHub] [drill] arina-ielchiieva commented on issue #1836: DRILL-7156: empty parquet files support
arina-ielchiieva commented on issue #1836: DRILL-7156: empty parquet files support
URL: https://github.com/apache/drill/pull/1836#issuecomment-520794418
> Ignores all schemas except last while writing empty parquet file
Please provide an example.
> Not support empty schemas (e.g. create table .. as select * from empty.json, e.g. {})
What will the behavior be in this case? Failure? No-op?
[GitHub] [drill] arina-ielchiieva commented on issue #1836: DRILL-7156: empty parquet files support
arina-ielchiieva commented on issue #1836: DRILL-7156: empty parquet files support
URL: https://github.com/apache/drill/pull/1836#issuecomment-520794009
Regarding this comment:
> Questions: TestParquetWriterEmptyFiles#testMultipleWriters now creates several empty files, but not fails, since reading of empty parquet is supported. Should I rewrite comment or remove the test?
I guess you can remove these tests and add new tests to `org.apache.drill.exec.store.parquet.TestEmptyParquet`.
[GitHub] [drill] arina-ielchiieva commented on issue #1836: DRILL-7156: empty parquet files support
arina-ielchiieva commented on issue #1836: DRILL-7156: empty parquet files support
URL: https://github.com/apache/drill/pull/1836#issuecomment-520774175
@oleg-zinovev support for reading empty parquet files is already merged into master (https://github.com/apache/drill/commit/4f4e1af53c9abccd1996f3b6841731e68768b48e). Do you plan to work on adding support for writing empty parquet files? We plan to include it in the next Drill release (end of August / beginning of September). If yes, please factor out the writing of empty parquet files and update the PR.
[GitHub] [drill] arina-ielchiieva commented on issue #1836: DRILL-7156: empty parquet files support
arina-ielchiieva commented on issue #1836: DRILL-7156: empty parquet files support
URL: https://github.com/apache/drill/pull/1836#issuecomment-518187400
@oleg-zinovev thanks, I have assigned https://issues.apache.org/jira/browse/DRILL-7156 to you.
[GitHub] [drill] arina-ielchiieva commented on issue #1836: DRILL-7156: empty parquet files support
arina-ielchiieva commented on issue #1836: DRILL-7156: empty parquet files support
URL: https://github.com/apache/drill/pull/1836#issuecomment-518183790
@oleg-zinovev thanks for making the changes, though the situation is a little awkward, since I was working on similar changes and did not know you intended to make them as well (https://issues.apache.org/jira/browse/DRILL-4517). However, I was working on reading empty parquet files, not writing them. I suggest you move the writing of empty parquet files into a separate PR. For reading, it might be better to use my changes instead: first, you change the metadata cache files, which would affect backward compatibility and require storing more information than needed; second, your changes do not seem to optimize reading complex types. What do you think?