Hello,

I need to develop an application that:

- reads XML files in thousands of directories, two levels down, from year x to year y

- extracts data from <image> tags in those files and stores it in an SQL or NoSQL database

- generates ImageMagick commands from the extracted data to produce images

- generates curl commands to index the image files with Solr

Does Spark provide any tools/features to facilitate and automate ("batchify") 
the above tasks?
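
To make the question concrete, below is the kind of job I have in mind, as a
minimal Spark (Java API) sketch. The /data/<year>/<subdir> layout, the sample
years standing in for "year x to year y", the flat <image> payload, the
ImageMagick and Solr/curl command templates, and the Solr endpoint are all
placeholder assumptions, and the regex stands in for a real XML parser:

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class ImagePipelineSketch {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("ImagePipelineSketch");
    JavaSparkContext sc = new JavaSparkContext(conf);

    // Hypothetical layout: /data/<year>/<subdir>/*.xml; the brace glob
    // stands in for the real year range, two levels down.
    JavaRDD<String> contents =
        sc.wholeTextFiles("/data/{2010,2011,2012}/*/*.xml").values();

    // Pull out <image> elements. A regex keeps the sketch short; a real
    // job would use an XML parser (or the spark-xml package) instead.
    Pattern imageTag = Pattern.compile("<image>(.*?)</image>", Pattern.DOTALL);
    JavaRDD<String> images = contents.flatMap(xml -> {
      List<String> found = new ArrayList<>();
      Matcher m = imageTag.matcher(xml);
      while (m.find()) {
        found.add(m.group(1).trim());
      }
      return found.iterator();
    });

    // Map each extracted value to shell commands; both templates are
    // placeholders for the real ImageMagick and Solr/curl invocations.
    JavaRDD<String> commands = images.map(src ->
        "convert " + src + " " + src + ".png && "
            + "curl 'http://localhost:8983/solr/images/update/extract'"
            + " -F 'file=@" + src + ".png'");

    // One output line per command; these could also be executed directly
    // on the executors rather than written out as a script.
    commands.saveAsTextFile("/data/imagemagick-and-solr-commands");
    sc.stop();
  }
}

The extracted records could just as well be written to an SQL or NoSQL store
from foreachPartition instead of (or before) emitting command lines, with
Spark distributing the work across the thousands of directories.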

I can do all of the above with one or several Java programs, but I wondered whether Spark would be of any help in such an endeavour.

Many thanks.

Philippe