nsivabalan commented on issue #5565:
URL: https://github.com/apache/hudi/issues/5565#issuecomment-1126350573
Here is a sample script I tried with the Spark datasource directly:
```
import org.apache.hudi.QuickstartUtils._
import scala.collection.JavaConversions._
import org.apache.spark.sql.SaveMode._
import org.apache.hudi.DataSourceReadOptions._
import org.apache.hudi.DataSourceWriteOptions._
import org.apache.hudi.config.HoodieWriteConfig._
val tableName = "hudi_trips_cow"
val basePath = "file:///tmp/hudi_trips_cow"
val dataGen = new DataGenerator
val inserts = convertToStringList(dataGen.generateInserts(10))
val df = spark.read.json(spark.sparkContext.parallelize(inserts, 2))
import org.apache.spark.sql.functions.lit
df.write.format("hudi").
options(getQuickstartWriteConfigs).
option(PRECOMBINE_FIELD_OPT_KEY, "ts").
option(RECORDKEY_FIELD_OPT_KEY, "uuid").
option(PARTITIONPATH_FIELD_OPT_KEY, "ppath").
option(TABLE_NAME, tableName).
option("hoodie.datasource.write.table.type","MERGE_ON_READ").
mode(Overwrite).
save(basePath)
df.write.format("hudi").
options(getQuickstartWriteConfigs).
option(PRECOMBINE_FIELD_OPT_KEY, "ts").
option(RECORDKEY_FIELD_OPT_KEY, "uuid").
option(PARTITIONPATH_FIELD_OPT_KEY, "ppath").
option(TABLE_NAME, tableName).
option("hoodie.datasource.write.table.type","MERGE_ON_READ").
option("hoodie.compact.schedule.inline","true").
option("hoodie.compact.inline.max.delta.commits","1").
mode(Append).
save(basePath)
// there should be 2 delta commits.
```
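To double-check the "2 delta commits" expectation from a script rather than by eyeballing `ls`, a small helper like the one below can count completed delta commit files in the timeline folder. This is a hypothetical convenience function of mine, not something Hudi ships; it simply relies on completed delta commits ending in `.deltacommit`.

```shell
# count_deltacommits: count completed delta commits in a Hudi timeline dir.
# Hypothetical helper -- counts only files ending exactly in ".deltacommit"
# (requested/inflight files have extra suffixes and are excluded by the glob).
count_deltacommits() {
  ls "$1"/*.deltacommit 2>/dev/null | wc -l | tr -d ' '
}

count_deltacommits /tmp/hudi_trips_cow/.hoodie
```

After the two writes above, this should print `2`.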
In another shell:
```
ls -ltr /tmp/hudi_trips_cow/.hoodie/
total 40
drwxr-xr-x 2 nsb wheel 64 May 13 14:45 archived
-rw-r--r-- 1 nsb wheel 0 May 13 14:45 20220513144543292.deltacommit.requested
drwxr-xr-x 4 nsb wheel 128 May 13 14:45 metadata
-rw-r--r-- 1 nsb wheel 903 May 13 14:45 hoodie.properties
-rw-r--r-- 1 nsb wheel 1778 May 13 14:45 20220513144543292.deltacommit.inflight
-rw-r--r-- 1 nsb wheel 2993 May 13 14:45 20220513144543292.deltacommit
-rw-r--r-- 1 nsb wheel 0 May 13 14:46 20220513144616019.deltacommit.requested
-rw-r--r-- 1 nsb wheel 3123 May 13 14:46 20220513144616019.deltacommit.inflight
-rw-r--r-- 1 nsb wheel 3799 May 13 14:46 20220513144616019.deltacommit
```
Now, let's try to schedule compaction. Continuing in the same spark-shell as before:
```
df.write.format("hudi").
options(getQuickstartWriteConfigs).
option(PRECOMBINE_FIELD_OPT_KEY, "ts").
option(RECORDKEY_FIELD_OPT_KEY, "uuid").
option(PARTITIONPATH_FIELD_OPT_KEY, "ppath").
option(TABLE_NAME, tableName).
option("hoodie.datasource.write.table.type","MERGE_ON_READ").
option("hoodie.compact.schedule.inline","true").
option("hoodie.compact.inline.max.delta.commits","1").
mode(Append).
save(basePath)
```
Now let's list `.hoodie` again:
```
ls -ltr /tmp/hudi_trips_cow/.hoodie/
total 64
drwxr-xr-x 2 nsb wheel 64 May 13 14:45 archived
-rw-r--r-- 1 nsb wheel 0 May 13 14:45 20220513144543292.deltacommit.requested
drwxr-xr-x 4 nsb wheel 128 May 13 14:45 metadata
-rw-r--r-- 1 nsb wheel 903 May 13 14:45 hoodie.properties
-rw-r--r-- 1 nsb wheel 1778 May 13 14:45 20220513144543292.deltacommit.inflight
-rw-r--r-- 1 nsb wheel 2993 May 13 14:45 20220513144543292.deltacommit
-rw-r--r-- 1 nsb wheel 0 May 13 14:46 20220513144616019.deltacommit.requested
-rw-r--r-- 1 nsb wheel 3123 May 13 14:46 20220513144616019.deltacommit.inflight
-rw-r--r-- 1 nsb wheel 3799 May 13 14:46 20220513144616019.deltacommit
-rw-r--r-- 1 nsb wheel 0 May 13 14:47 20220513144711915.deltacommit.requested
-rw-r--r-- 1 nsb wheel 3123 May 13 47 20220513144711915.deltacommit.inflight
-rw-r--r-- 1 nsb wheel 3953 May 13 14:47 20220513144711915.deltacommit
-rw-r--r-- 1 nsb wheel 1651 May 13 14:47 20220513144714578.compaction.requested
```
Check out the last entry: it's a `compaction.requested` file, which means a compaction plan was scheduled.
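For scripted verification that scheduling worked, one can check the timeline for a requested compaction plan the same way. Again, this is a hypothetical helper of mine, not a Hudi utility; it just tests for the presence of a `.compaction.requested` file:

```shell
# has_pending_compaction: succeed (exit 0) if the timeline directory
# contains a requested compaction plan. Hypothetical helper, not part of Hudi.
has_pending_compaction() {
  ls "$1"/*.compaction.requested >/dev/null 2>&1
}

if has_pending_compaction /tmp/hudi_trips_cow/.hoodie; then
  echo "compaction scheduled"
fi
```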