LantaoJin opened a new pull request, #61:
URL: https://github.com/apache/datafusion-java/pull/61

   Mirror writeParquet's surface for newline-delimited JSON. JsonWriteOptions 
exposes singleFileOutput, partitionCols, and fileCompressionType; the 
DataFusion-side JsonOptions only carries compression in writer mode (the 
read-side toggles like newline_delimited and schema_infer_max_rec do not apply 
here).
   
   JsonOptions has no fluent setters, so the native handler builds it via 
struct-update syntax (same idiom as ArrowReadOptions / AvroReadOptions). 
Option<JsonOptions> stays None when no writer-side knob is set, so DataFusion's 
runtime defaults are preserved when callers pass new JsonWriteOptions().
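
   A minimal sketch of the struct-update idiom described above, using stand-in types. `JsonOptionsSketch`, `CompressionSketch`, and `build_json_options` are hypothetical names for illustration; they are not DataFusion's real definitions and not the handler's actual code. The sketch assumes compression is the only writer-side knob and shows the result staying `None` when it is unset.

```rust
// Stand-in types for illustration only; NOT DataFusion's real definitions.
#[derive(Debug, Clone, Copy, PartialEq, Default)]
enum CompressionSketch {
    #[default]
    Uncompressed,
    Gzip,
}

#[derive(Debug, Clone, PartialEq, Default)]
struct JsonOptionsSketch {
    // The only knob that matters in writer mode.
    compression: CompressionSketch,
    // Read-side knob; left at its default when writing.
    schema_infer_max_rec: Option<usize>,
}

/// Builds options only when the caller set a writer-side knob.
/// Returns `None` otherwise, so the runtime defaults stay in effect.
fn build_json_options(compression: Option<CompressionSketch>) -> Option<JsonOptionsSketch> {
    compression.map(|c| JsonOptionsSketch {
        compression: c,
        // Struct-update syntax: every other field takes its default value.
        ..Default::default()
    })
}
```

   With no fluent setters available, struct-update syntax keeps the handler from having to enumerate read-side fields it never touches.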
   
   When the caller leaves singleFileOutput unset, default to directory output 
(with_single_file_output(false)) rather than DataFusion's Automatic mode. 
Automatic treats extension-bearing paths like "out.json" as single-file 
targets, which would silently contradict the documented "directory unless 
overridden" default.
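
   The difference between the two defaults can be sketched as follows. This is a simplified model, not DataFusion's implementation: `OutputModeSketch`, `automatic_mode`, and `resolve_mode` are hypothetical names, and the extension check only approximates the Automatic heuristic described above.

```rust
use std::path::Path;

#[derive(Debug, Clone, Copy, PartialEq)]
enum OutputModeSketch {
    SingleFile,
    Directory,
}

/// Simplified model of an extension-based "automatic" heuristic:
/// a path with a file extension and no trailing slash is treated as one file.
fn automatic_mode(path: &str) -> OutputModeSketch {
    if !path.ends_with('/') && Path::new(path).extension().is_some() {
        OutputModeSketch::SingleFile
    } else {
        OutputModeSketch::Directory
    }
}

/// Model of the behavior described in this PR: an unset flag means
/// directory output, regardless of what the path looks like.
fn resolve_mode(single_file_output: Option<bool>) -> OutputModeSketch {
    match single_file_output {
        Some(true) => OutputModeSketch::SingleFile,
        Some(false) | None => OutputModeSketch::Directory,
    }
}
```

   Under this model, `automatic_mode("out.json")` picks a single file while `resolve_mode(None)` picks a directory, which is the mismatch the explicit default avoids.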
   
   ## Which issue does this PR close?
   
   - Closes #39.
   
   ## Rationale for this change
   
   `DataFrame.writeParquet` shipped in #27. JSON is the third writer 
DataFusion's `DataFrame` API exposes natively (`DataFrame::write_json`) and is 
the easiest format to consume from non-Arrow downstream tooling. The 
implementation follows the same proto-over-JNI pattern as the merged readers 
and mirrors the writer-side shape we'd land for CSV (#38). It has zero 
binary-size impact: DataFusion's JSON support is in the default feature set, 
so no Cargo flag changes are required.
   
   ## What changes are included in this PR?
   
   - `proto/json_write_options.proto` -- new `JsonWriteOptionsProto` message
   - `JsonWriteOptions` Java builder
   - `Java_org_apache_datafusion_DataFrame_writeJsonWithOptions` JNI handler in 
`native/src/json.rs`
   
   ## Are these changes tested?
   
   Yes -- 9 new tests across `JsonWriteOptionsTest` and 
`DataFrameWriteJsonTest`.
   
   ## Are there any user-facing changes?
   
   Yes -- purely additive. New public API:
   
   - `org.apache.datafusion.JsonWriteOptions`
   - `DataFrame.writeJson(String)`
   - `DataFrame.writeJson(String, JsonWriteOptions)`
   
   The new `org.apache.datafusion.protobuf.JsonWriteOptionsProto` generated 
class is also exposed via the protobuf-Java output, consistent with how 
`CsvReadOptionsProto`, `NdJsonReadOptionsProto`, etc. are exposed. No API 
removals, no deprecations, no behavior change for existing callers. No Cargo 
feature changes; binary size is unchanged.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

