I am testing apache beam to read/ write xml files. But I encounter a problem
that even the code is just to read a single xml file and write it out without
doing any transformation, the process seems to hang indefinitely. The output
looks like below:
[pool-2-thread-5-ScalaTest-running-XmlSpec] INFO
org.apache.beam.sdk.io.FileBasedSource - Matched 1 files for pattern
/tmp/input/my.xml
[pool-6-thread-1] INFO org.apache.beam.sdk.io.Write - Initializing write
operation org.apache.beam.sdk.io.XmlSink$XmlWriteOperation@1c72df2c
The code basically do the following:
val options = PipelineOptionsFactory.create
val p = Pipeline.create(options)
val xml = XmlSource.from[Record](new
File("/tmp/input/my.xml").toPath.toString).withRootElement("RootE").withRecordElement("Record").withRecordClass(classOf[Record])
p.apply(Read.from(xml)).apply(Write.to(XmlSink.write().toFilenamePrefix("xml").ofRecordClass(classOf[Record]).withRootElement("RootE")))
p.run.waitUntilFinish
What part may be missing in my program?
Thanks