Hello,
I need to develop an application which:

- reads XML files in thousands of directories, two levels down, from year x to year y
- extracts data from <image> tags in those files and stores it in a SQL or NoSQL database
- generates ImageMagick commands based on the extracted data to generate images
- generates curl commands to index the image files with Solr

Does Spark provide any tools/features to facilitate and automate ("batchify") the above tasks? I can do all of the above with one or several Java programs, but I wondered if using Spark would be of any use in such an endeavour.

Many thanks.

Philippe

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
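For what it's worth, the per-file work (parse XML, pull out <image> tags, emit ImageMagick and curl commands) is the part that Spark could distribute, e.g. by feeding each file's content through a function like the one below via `SparkContext.wholeTextFiles`. Here is a minimal plain-Java sketch of that per-file step; the attribute names (`src`, `width`), the output file naming, and the Solr URL are assumptions for illustration, not part of the original description:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ImageCommandGenerator {

    // Extracts <image> elements from one XML document and emits, for each,
    // an ImageMagick command and a Solr indexing curl command.
    // Attribute names and the Solr endpoint are illustrative assumptions.
    public static List<String> commandsFor(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        NodeList images = doc.getElementsByTagName("image");
        List<String> commands = new ArrayList<>();
        for (int i = 0; i < images.getLength(); i++) {
            Element img = (Element) images.item(i);
            String src = img.getAttribute("src");
            String width = img.getAttribute("width");
            // ImageMagick resize command for this image
            commands.add("convert " + src + " -resize " + width
                    + " out_" + src);
            // curl command posting the image to a (hypothetical) Solr core
            commands.add("curl 'http://localhost:8983/solr/images/update/extract"
                    + "?literal.id=" + src + "&commit=true' -F file=@" + src);
        }
        return commands;
    }

    public static void main(String[] args) throws Exception {
        String sample = "<doc><image src=\"a.jpg\" width=\"200\"/></doc>";
        for (String cmd : commandsFor(sample)) {
            System.out.println(cmd);
        }
    }
}
```

In a Spark job, the same function could be mapped over the `(path, content)` pairs that `wholeTextFiles` yields, which would handle the directory traversal and parallelism for you.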