Hi Richard,

Can you share a little more info about your environment? Answers to the
following questions would help narrow this down.

* What runner are you using?
* What version of the SDK?
* Does this reproduce in the DirectRunner?
* Can you share a full reproduction (e.g., in a GitHub gist)?
* What is happening on the machine(s) executing the job? Is CPU usage high?
  Is the disk active? Etc.
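
On the last question: if the process looks idle, a thread dump usually shows
where it is stuck. Below is a minimal sketch that uses only the JDK's
management API. The names ThreadDump, dumpThreads, and dumpAfter are
hypothetical helpers I made up for illustration; they are not part of Beam.

```scala
import java.lang.management.ManagementFactory

// Hypothetical diagnostic helper; not part of the Beam SDK.
object ThreadDump {

  // Formats the name, state, and full stack of every live JVM thread.
  def dumpThreads(): String =
    ManagementFactory.getThreadMXBean
      .dumpAllThreads(false, false) // skip lock details; stacks are enough here
      .map { info =>
        val header = s"[${info.getThreadName}] state=${info.getThreadState}"
        val frames = info.getStackTrace.map(frame => s"    at $frame")
        (header +: frames.toSeq).mkString("\n")
      }
      .mkString("\n\n")

  // Starts a daemon thread that prints the dump to stderr after the delay.
  def dumpAfter(delayMillis: Long): Thread = {
    val watcher = new Thread(new Runnable {
      def run(): Unit = {
        Thread.sleep(delayMillis)
        System.err.println(dumpThreads())
      }
    })
    watcher.setDaemon(true) // must not keep the JVM alive by itself
    watcher.start()
    watcher
  }
}
```

Calling ThreadDump.dumpAfter(60000) just before p.run.waitUntilFinish should
print the stacks after a minute if the pipeline is still running. Running the
JDK's jstack tool against the test JVM gives the same kind of information.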

Thanks,
Dan

On Tue, Apr 4, 2017 at 9:33 AM, Richard Hanson <[email protected]> wrote:

> I am testing Apache Beam to read and write XML files, but I have run into
> a problem: even when the code only reads a single XML file and writes it
> back out without any transformation, the process seems to hang
> indefinitely. The output looks like this:
>
> [pool-2-thread-5-ScalaTest-running-XmlSpec] INFO org.apache.beam.sdk.io.FileBasedSource - Matched 1 files for pattern /tmp/input/my.xml
> [pool-6-thread-1] INFO org.apache.beam.sdk.io.Write - Initializing write operation org.apache.beam.sdk.io.XmlSink$XmlWriteOperation@1c72df2c
>
>
> The code basically does the following:
>
> val options = PipelineOptionsFactory.create
> val p = Pipeline.create(options)
>
>
> val xml = XmlSource.from[Record](new File("/tmp/input/my.xml").toPath.toString)
>   .withRootElement("RootE")
>   .withRecordElement("Record")
>   .withRecordClass(classOf[Record])
>
> p.apply(Read.from(xml))
>   .apply(Write.to(
>     XmlSink.write()
>       .toFilenamePrefix("xml")
>       .ofRecordClass(classOf[Record])
>       .withRootElement("RootE")))
>
> p.run.waitUntilFinish
>
>
> What part may be missing in my program?
>
> Thanks
>
