Write custom JSON from DataFrame in PySpark

2023-05-03 Thread Marco Costantini
Hello,

Let's say I have a very simple DataFrame, as below.

+---+----+
| id|datA|
+---+----+
|  1|  a1|
|  2|  a2|
|  3|  a3|
+---+----+

Let's say I have a requirement to write this to a bizarre JSON structure.
For example:

{
  "id": 1,
  "stuff": {
"datA": "a1"
  }
}

How can I achieve this with PySpark? I have only seen the following:
- writing the DataFrame as-is (doesn't meet requirement)
- using a UDF (seems frowned upon)

What I have tried is doing this within a `foreach`. I have had some
success, but also ran into problems with other requirements (serializing
other things).
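
For reference, a minimal sketch of one approach that avoids both a UDF and
a `foreach`: nest `datA` inside a struct column and let the built-in JSON
writer handle serialization. It assumes the DataFrame above is bound to
`df`; the output path is a placeholder.

from pyspark.sql import functions as F

# Nest datA under a top-level "stuff" key; id stays at the top level.
nested = df.select(
    F.col("id"),
    F.struct(F.col("datA")).alias("stuff"),
)

# Each record is written as {"id":1,"stuff":{"datA":"a1"}}.
nested.write.json("/tmp/custom_json")  # placeholder output path

Because `struct` is a built-in column expression, this stays in the JVM
and runs fully distributed, unlike a Python UDF or per-row work in
`foreach`.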

Any advice? Please and thank you,
Marco.


How to create a Spark UDF using FunctionCatalog?

2023-05-03 Thread tzxxh
We are using Spark. Today I came across the FunctionCatalog API. I have
read the source of
spark/sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2FunctionSuite.scala
and have implemented a ScalarFunction, but I still do not know how to
register it in SQL.
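
For reference, a minimal sketch of the registration step, following how
DataSourceV2FunctionSuite wires things up: the ScalarFunction is exposed
through a FunctionCatalog implementation (a CatalogPlugin), which is
registered under a `spark.sql.catalog.<name>` config key and then invoked
by its catalog-qualified name in SQL. The class, namespace, and function
names below (`com.example.MyFunctionCatalog`, `ns`, `strlen`) are
hypothetical placeholders.

from pyspark.sql import SparkSession

# Hypothetical JVM-side class implementing
# org.apache.spark.sql.connector.catalog.FunctionCatalog,
# whose loadFunction() returns the custom ScalarFunction.
spark = (
    SparkSession.builder
    .config("spark.sql.catalog.my_cat", "com.example.MyFunctionCatalog")
    .getOrCreate()
)

# V2 functions are resolved as <catalog>.<namespace>.<function>.
spark.sql("SELECT my_cat.ns.strlen('spark')").show()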


unsubscribe

2023-05-03 Thread Kang
-- 
Best Regards!
Kang