This is an automated email from the ASF dual-hosted git repository.
zkaoudi pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/wayang-website.git
The following commit(s) were added to refs/heads/main by this push:
new 101e37ad Added json-wayang-plan-guide (#111)
101e37ad is described below
commit 101e37ad327de2f96fb2366bc2c7c35cc8548662
Author: aleudework <[email protected]>
AuthorDate: Thu Jan 8 09:50:37 2026 +0100
Added json-wayang-plan-guide (#111)
Co-authored-by: aleude <[email protected]>
---
docs/guide/json-wayang-plan-guide.md | 265 +++++++++++++++++++++++++++++++++++
1 file changed, 265 insertions(+)
diff --git a/docs/guide/json-wayang-plan-guide.md b/docs/guide/json-wayang-plan-guide.md
new file mode 100644
index 00000000..05e8144c
--- /dev/null
+++ b/docs/guide/json-wayang-plan-guide.md
@@ -0,0 +1,265 @@
+# Documentation of Supported Operators in JSON-REST API
+
+This is not full coverage of the operators and plan parameters supported by the API. It covers only those that we were able to verify ourselves through testing and that we allow in the agentic systems.
+
+For a Wayang plan in JSON to be executable, it must contain at least two main properties in its outer object: “context” and “operators”.
+
+## Overall Wayang Plan structure
+
+```json
+{
+ "context": { ... },
+ "operators": [ ... ]
+}
+```
+
+## Context Property
+
+The “context” property is an object with at least two fields: “platforms” and “configuration”. The “platforms” field is a list of strings naming the platforms available to the plan, such as “java” or “spark”. At least one platform must be provided. The “configuration” field contains optional settings passed to Wayang. The field itself is required, but an empty object is accepted; in that case, we believe that Wayang falls back to its default configuration settings. An example of “context” is shown below:
+
+```json
+{
+ "context": {
+ "platforms": ["java", "spark"],
+ "configuration": {}
+ }
+}
+```
+
+## Operator Property
+
+The “operators” property is a list of objects. Each object is an operator specification, and together the list defines the operators used in a plan. An operator has a set of required fields that it must contain to work. Some fields are common across all operators, while others are unique to the specific operator.
+
+Table 1 describes the required fields for most operators. Some operators require additional properties inside the “data” object; these are listed in the operator schema in Table 2. A short sketch after Table 1 shows how operators are wired together through these fields.
+
+### Operator Properties in Apache Wayang Plans (Table 1)
+
+| Property | Value | Required in all operators | Description |
+|--------|------|---------------------------|-------------|
+| id | `Int` | Yes | The unique ID for an operator in a plan. |
+| cat | `String` | Yes | The operator's category or group, for example *input* or *unary*. |
+| input | `List[Int]` | Yes | Unique IDs of operators that send data to this operator. An operator is a source if the list is empty, for example input operators. |
+| output | `List[Int]` | Yes | Unique IDs of operators that receive data from this operator. An operator is a sink if the list is empty, for example output operators. |
+| operatorName | `String` | Yes | Name of the operator. |
+| data | `Object` | No | An object containing additional operator-specific properties. The *count* and *distinct* operators do not require the data property. |
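+
+The *input* and *output* lists are what wire operators into a dataflow: each operator names the IDs of its upstream and downstream neighbours. As a minimal sketch (with a placeholder file path), the fragment below links a source to a sink: operator 1 has no inputs and sends its data to operator 2, which in turn lists operator 1 as its input.
+
+```json
+[
+  {
+    "id": 1,
+    "cat": "input",
+    "input": [],
+    "output": [2],
+    "operatorName": "textFileInput",
+    "data": {
+      "filename": "file:///tmp/words.txt"
+    }
+  },
+  {
+    "id": 2,
+    "cat": "unary",
+    "input": [1],
+    "output": [],
+    "operatorName": "count"
+  }
+]
+```
+
+In a full plan, this list would appear as the value of the “operators” property, alongside a “context” object.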
+
+## Operator Schema
+
+Table 2 below is a schema of all tested operators. Every operator requires the properties listed in Table 1; the “Required data properties” column lists the operator-specific properties that must appear inside the “data” object. The schema is not full coverage of the JSON-REST API, and additional operators and parameters are supported as well. These are the operators that we allow in the agentic systems.
+
+### Supported Apache Wayang Operators and their Required Data Properties (Table 2)
+
+| Category | Operator | Description | Required data properties | Example |
+|--------|---------|-------------|--------------------------|---------|
+| Input | jdbcRemoteInput | Reads data from a database using a JDBC connection. | uri, username, password, table, columnNames | jdbc_input |
+| Input | textFileInput | Reads data from a text file line by line. | filename | textfile_input |
+| Unary | map | Applies a function to each element and returns the transformed element. | udf | map |
+| Unary | flatMap | Applies a function that may return zero, one, or multiple elements. | udf | flatmap |
+| Unary | filter | Keeps only elements that satisfy a condition. | udf | filter |
+| Unary | distinct | Removes duplicate elements. | -- | distinct |
+| Unary | sort | Sorts elements according to a key. | keyUdf | sort |
+| Unary | sample | Randomly selects a subset of elements. | sampleSize | sample |
+| Unary | reduce | Aggregates all elements into a single result. | udf | reduce |
+| Unary | reduceBy | Aggregates elements with the same key. | keyUdf, udf | reduceby |
+| Unary | groupBy | Groups elements by key. | keyUdf | groupby |
+| Unary | count | Counts the number of elements. | -- | count |
+| Binary | join | Combines two datasets using an inner join. | thisKeyUdf, thatKeyUdf | join |
+| Output | textFileOutput | Writes output data to a text file. | filename | textfile_output |
+
+## Operator Examples
+
+### jdbcRemoteInput
+```json
+{
+ "id": 1,
+ "cat": "input",
+ "input": [],
+ "output": [2],
+ "operatorName": "jdbcRemoteInput",
+ "data": {
+ "uri": "jdbc:postgresql://localhost:5432/master_thesis_db",
+ "username": "master_thesis",
+ "password": "master",
+ "table": "person",
+ "columnNames": ["id", "name", "age", "address"]
+ }
+}
+```
+
+### textFileInput
+```json
+{
+ "id": 1,
+ "cat": "input",
+ "input": [],
+ "output": [2],
+ "operatorName": "textFileInput",
+ "data": {
+ "filename": "file:///Users/alexander/Downloads/textfile_input.txt"
+ }
+}
+```
+
+### map
+```json
+{
+ "id": 3,
+ "cat": "unary",
+ "input": [2],
+ "output": [],
+ "operatorName": "map",
+ "data": {
+ "udf": "(w: String) => w.length"
+ }
+}
+```
+
+### flatMap
+```json
+{
+ "id": 2,
+ "cat": "unary",
+ "input": [1],
+ "output": [3],
+ "operatorName": "flatMap",
+ "data": {
+ "udf": "(s: String) => s.split(\" \").toList"
+ }
+}
+```
+
+### filter
+```json
+{
+ "id": 3,
+ "cat": "unary",
+ "input": [2],
+ "output": [4],
+ "operatorName": "filter",
+ "data": {
+ "udf": "(w: String) => w.length > 4"
+ }
+}
+```
+
+### distinct
+```json
+{
+ "id": 2,
+ "cat": "unary",
+ "input": [1],
+ "output": [],
+ "operatorName": "distinct"
+}
+```
+
+### sort
+```json
+{
+ "id": 2,
+ "cat": "unary",
+ "input": [1],
+ "output": [],
+ "operatorName": "sort",
+ "data": {
+ "keyUdf": "(w: String) => w"
+ }
+}
+```
+
+### sample
+```json
+{
+ "id": 2,
+ "cat": "unary",
+ "input": [1],
+ "output": [3],
+ "operatorName": "sample",
+ "data": {
+ "sampleSize": 3
+ }
+}
+```
+
+### reduce
+```json
+{
+ "id": 2,
+ "cat": "unary",
+ "input": [1],
+ "output": [3],
+ "operatorName": "reduce",
+ "data": {
+ "udf": "(a: Int, b: Int) => a + b"
+ }
+}
+```
+
+### reduceBy
+```json
+{
+ "id": 4,
+ "cat": "unary",
+ "input": [3],
+ "output": [5],
+ "operatorName": "reduceBy",
+ "data": {
+ "keyUdf": "(pair: (String, Int)) => pair._1",
+ "udf": "(a: (String, Int), b: (String, Int)) => (a._1, a._2 + b._2)"
+ }
+}
+```
+
+### groupBy
+```json
+{
+ "id": 2,
+ "cat": "unary",
+ "input": [1],
+ "output": [3],
+ "operatorName": "groupBy",
+ "data": {
+ "keyUdf": "(w: String) => w.substring(0,1)"
+ }
+}
+```
+
+### count
+```json
+{
+ "id": 2,
+ "cat": "unary",
+ "input": [1],
+ "output": [],
+ "operatorName": "count"
+}
+```
+
+### join
+```json
+{
+ "id": 3,
+ "cat": "binary",
+ "input": [1, 2],
+ "output": [4],
+ "operatorName": "join",
+ "data": {
+ "thisKeyUdf": "(t: (String, Int)) => t._1",
+ "thatKeyUdf": "(t: (String, String)) => t._1"
+ }
+}
+```
+
+### textFileOutput
+```json
+{
+ "id": 4,
+ "cat": "output",
+ "input": [3],
+ "output": [],
+ "operatorName": "textFileOutput",
+ "data": {
+ "filename": "file:///Users/alexander/Downloads/testoutput1.txt"
+ }
+}
+```
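+
+## Complete Plan Example
+
+To show how the pieces above fit together, here is a sketch of a complete word-count plan assembled from the operators documented in this guide. The input and output file paths are placeholders, and the plan as a whole is meant to illustrate the structure rather than to reproduce a verified run; the UDFs follow the per-operator examples above.
+
+```json
+{
+  "context": {
+    "platforms": ["java"],
+    "configuration": {}
+  },
+  "operators": [
+    {
+      "id": 1,
+      "cat": "input",
+      "input": [],
+      "output": [2],
+      "operatorName": "textFileInput",
+      "data": {
+        "filename": "file:///tmp/words.txt"
+      }
+    },
+    {
+      "id": 2,
+      "cat": "unary",
+      "input": [1],
+      "output": [3],
+      "operatorName": "flatMap",
+      "data": {
+        "udf": "(s: String) => s.split(\" \").toList"
+      }
+    },
+    {
+      "id": 3,
+      "cat": "unary",
+      "input": [2],
+      "output": [4],
+      "operatorName": "map",
+      "data": {
+        "udf": "(w: String) => (w, 1)"
+      }
+    },
+    {
+      "id": 4,
+      "cat": "unary",
+      "input": [3],
+      "output": [5],
+      "operatorName": "reduceBy",
+      "data": {
+        "keyUdf": "(pair: (String, Int)) => pair._1",
+        "udf": "(a: (String, Int), b: (String, Int)) => (a._1, a._2 + b._2)"
+      }
+    },
+    {
+      "id": 5,
+      "cat": "output",
+      "input": [4],
+      "output": [],
+      "operatorName": "textFileOutput",
+      "data": {
+        "filename": "file:///tmp/wordcount_output.txt"
+      }
+    }
+  ]
+}
+```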
\ No newline at end of file