[GitHub] [arrow-testing] westonpace commented on pull request #89: feat: data to test java substrait consumer

via GitHub Thu, 23 Mar 2023 06:17:01 -0700


westonpace commented on PR #89:
URL: https://github.com/apache/arrow-testing/pull/89#issuecomment-1481178615


   > @westonpace would it be useful to have more Acero/Substrait testing data 
like this in apache/arrow?
   
   We have a lot of hard-coded JSON, but it's embedded in the test files 
themselves (e.g. serde_test.cc or test_substrait.py) and not in standalone 
files.  The original concern around hard-coded JSON was that Substrait may 
evolve quickly and those JSON files would be difficult to maintain.  For 
example, the JSON files in this PR are missing the version field (Isthmus does 
not yet populate this) and they don't have URIs for the extension functions 
(almost no one generates these yet).  So they may need to change at some point.
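
To make the maintenance concern concrete, here is a minimal sketch of a check that flags plans lacking those two pieces of metadata. It is not part of this PR. The field names `version` and `extensionUris` are an assumption: they are what the protobuf JSON mapping of the Substrait `Plan` message should produce. The helper name `missing_plan_fields` is hypothetical.

```python
import json


def missing_plan_fields(plan_json: str) -> list:
    """Return the top-level metadata fields that are absent or empty."""
    plan = json.loads(plan_json)
    missing = []
    # Assumed JSON field names (protobuf JSON mapping of the Plan message).
    for field in ("version", "extensionUris"):
        # Treat a missing key, an empty object, and an empty list the same way.
        if not plan.get(field):
            missing.append(field)
    return missing


if __name__ == "__main__":
    # A plan carrying only relations, similar in shape to the files discussed.
    example = '{"extensions": [], "relations": []}'
    print(missing_plan_fields(example))
```

A check like this could run over every stored JSON file to show which ones would need regenerating once producers start populating these fields.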
   
   As a result, I have been waiting for the text format to be ready before 
making any attempt to curate a large set of test queries (but that is still a 
few months off at least).
   
   I think SQL is probably a pretty good solution if you have a good 
SQL->Substrait library (that may be an advantage for Java).  In that case I 
would suggest only storing the SQL and then generating the Substrait on the fly.
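
As a rough sketch of that "store only the SQL" approach (an illustration, not code from this PR): the test data directory holds `.sql` files, and plans are produced at test time by whatever SQL->Substrait converter is available. The `to_substrait` callable and the `plans_from_sql` helper are hypothetical placeholders. No particular library API is assumed.

```python
from pathlib import Path
from typing import Callable, Dict


def plans_from_sql(sql_dir: Path,
                   to_substrait: Callable[[str], bytes]) -> Dict[str, bytes]:
    """Generate one plan per stored .sql file, keyed by the file's stem."""
    plans = {}
    # Sort for a deterministic order so test output is stable across runs.
    for sql_file in sorted(sql_dir.glob("*.sql")):
        # to_substrait is a stand-in for a real SQL->Substrait converter.
        plans[sql_file.stem] = to_substrait(sql_file.read_text())
    return plans
```

Because the plans are regenerated on every run, changes in the Substrait format only require updating the converter, not the stored test data.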
   
   I don't actually know what the legal ramifications are for TPC-H, but it is 
a good question.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
