This is an automated email from the ASF dual-hosted git repository.
william pushed a commit to branch branch-2.0
in repository https://gitbox.apache.org/repos/asf/orc.git
The following commit(s) were added to refs/heads/branch-2.0 by this push:
new 2297b70d0 ORC-1647: Tips for supporting ORC in the `convert` command
2297b70d0 is described below
commit 2297b70d0d775edc6cefdf4c46c7ee5bba488879
Author: sychen <[email protected]>
AuthorDate: Sun Mar 10 15:05:39 2024 -0700
ORC-1647: Tips for supporting ORC in the `convert` command
### What changes were proposed in this pull request?
This PR updates the help text and documentation to state that the `convert` command also accepts ORC files as input.
### Why are the changes needed?
The `convert` command already accepts ORC files as source input,
but neither the tool's help output nor the documentation mentions this.
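
For illustration only, here is a minimal sketch of how a converter could classify its inputs so that ORC is accepted alongside CSV and JSON. It is not taken from `ConvertTool`: the class `InputFormatSketch`, the `Format` enum, the `detect` method, and the rule of choosing the format from the file name suffix (ignoring a trailing `.gz`) are all assumptions made for this example.

```java
import java.util.Locale;

// Hypothetical sketch; none of these names come from the ORC code base.
public class InputFormatSketch {
  // The three source formats named in the updated help text.
  enum Format { CSV, JSON, ORC }

  // Assumption: the format is inferred from the file name suffix,
  // after dropping an optional ".gz" compression suffix.
  static Format detect(String fileName) {
    String name = fileName.toLowerCase(Locale.ROOT);
    if (name.endsWith(".gz")) {
      name = name.substring(0, name.length() - ".gz".length());
    }
    if (name.endsWith(".csv")) {
      return Format.CSV;
    }
    if (name.endsWith(".json")) {
      return Format.JSON;
    }
    if (name.endsWith(".orc")) {
      return Format.ORC;
    }
    throw new IllegalArgumentException("Unknown format for file " + fileName);
  }

  public static void main(String[] args) {
    // Print the detected format for a few sample file names.
    for (String f : new String[] {"a.csv", "b.json.gz", "c.orc"}) {
      System.out.println(f + " -> " + detect(f));
    }
  }
}
```

The point of the sketch is only that ORC is a peer of CSV and JSON on the input side, which is what the updated help text now says.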
### How was this patch tested?
Tested locally:
```bash
java -jar orc-tools-2.1.0-SNAPSHOT-uber.jar -h
```
Output
```
ORC Java Tools
usage: java -jar orc-tools-*.jar [--help] [--define X=Y] <command> <args>
Commands:
convert - convert CSV/JSON/ORC files to ORC
count - recursively find *.orc and print the number of rows
data - print the data from the ORC file
json-schema - scan JSON files to determine their schema
key - print information about the keys
meta - print the metadata about the ORC file
scan - scan the ORC file
sizes - list size on disk of each column
version - print the version of this ORC tool
To get more help, provide -h to the command
```
### Was this patch authored or co-authored using generative AI tooling?
No
Closes #1838 from cxzl25/ORC-1647.
Authored-by: sychen <[email protected]>
Signed-off-by: William Hyun <[email protected]>
(cherry picked from commit b894186401939b1ab8a0a9ab180f4d028973e12c)
Signed-off-by: William Hyun <[email protected]>
---
java/tools/src/java/org/apache/orc/tools/Driver.java | 2 +-
java/tools/src/java/org/apache/orc/tools/convert/ConvertTool.java | 2 +-
site/_docs/java-tools.md | 4 ++--
3 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/java/tools/src/java/org/apache/orc/tools/Driver.java b/java/tools/src/java/org/apache/orc/tools/Driver.java
index 95e2d8728..5b993c2e9 100644
--- a/java/tools/src/java/org/apache/orc/tools/Driver.java
+++ b/java/tools/src/java/org/apache/orc/tools/Driver.java
@@ -86,7 +86,7 @@ public class Driver {
" [--define X=Y] <command> <args>");
System.err.println();
System.err.println("Commands:");
- System.err.println(" convert - convert CSV and JSON files to ORC");
+ System.err.println(" convert - convert CSV/JSON/ORC files to ORC");
System.err.println(" count - recursively find *.orc and print the number of rows");
System.err.println(" data - print the data from the ORC file");
System.err.println(" json-schema - scan JSON files to determine their schema");
diff --git a/java/tools/src/java/org/apache/orc/tools/convert/ConvertTool.java b/java/tools/src/java/org/apache/orc/tools/convert/ConvertTool.java
index 19bead399..585a43fe6 100644
--- a/java/tools/src/java/org/apache/orc/tools/convert/ConvertTool.java
+++ b/java/tools/src/java/org/apache/orc/tools/convert/ConvertTool.java
@@ -44,7 +44,7 @@ import java.util.List;
import java.util.zip.GZIPInputStream;
/**
- * A conversion tool to convert CSV or JSON files into ORC files.
+ * A conversion tool to convert CSV, JSON OR ORC files into ORC files.
*/
public class ConvertTool {
static final String DEFAULT_TIMESTAMP_FORMAT =
diff --git a/site/_docs/java-tools.md b/site/_docs/java-tools.md
index 87c30af36..70a430929 100644
--- a/site/_docs/java-tools.md
+++ b/site/_docs/java-tools.md
@@ -11,7 +11,7 @@ supports both the local file system and HDFS.
The subcommands for the tools are:
- * convert (since ORC 1.4) - convert JSON/CSV files to ORC
+ * convert (since ORC 1.4) - convert CSV/JSON/ORC files to ORC
* count (since ORC 1.6) - recursively find *.orc and print the number of rows
* data - print the data of an ORC file
* json-schema (since ORC 1.4) - determine the schema of JSON documents
@@ -29,7 +29,7 @@ The command line looks like:
## Java Convert
-The convert command reads several JSON/CSV files and converts them into a
+The convert command reads several CSV/JSON/ORC files and converts them into a
single ORC file.
`-b,--bloomFilterColumns <columns>`