Hello, I'd like to know whether there is an alternative to 'hadoop dfs -getmerge' that works over HTTP. The closest thing I could find is the 'Download this file' link, but that is available only for individual parts, not for the whole directory ( http://hadoop:50075/streamFile?filename=%2Fuser%2Fhadoop-user%2Foutput%2Fsolr%2F%2Fpart-00000 )
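To illustrate what a getmerge-like result over HTTP would look like, here is a rough client-side sketch that fetches each part through the streamFile link above and appends them to one file. It assumes the parts are named part-00000, part-00001, ... and that the number of parts is known in advance; the helper names part_urls and merge_parts are made up for this example, and it does not give a single URL for the whole directory.

```python
import shutil
from urllib.parse import quote
from urllib.request import urlopen


def part_urls(datanode, directory, num_parts):
    """Build streamFile URLs for part-00000 .. part-<num_parts - 1>.

    Assumption: parts follow the usual part-NNNNN naming and num_parts
    is known by the caller.
    """
    urls = []
    for i in range(num_parts):
        path = "%s/part-%05d" % (directory.rstrip("/"), i)
        # Percent-encode the whole path, slashes included, as in the link above.
        urls.append("http://%s/streamFile?filename=%s"
                    % (datanode, quote(path, safe="")))
    return urls


def merge_parts(urls, out, opener=urlopen):
    """Append the body of each URL, in order, to the binary file object out."""
    for url in urls:
        with opener(url) as response:
            shutil.copyfileobj(response, out)


# Example use (needs a reachable datanode, so it is left commented out):
# with open("merged.out", "wb") as out:
#     merge_parts(part_urls("hadoop:50075", "/user/hadoop-user/output/solr", 4), out)
```

The opener argument is only there so the merge step can be tried without a cluster; by default it uses urllib's urlopen.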
What I'd like to do is push the output of a Hadoop job to Solr. The options are:
1. Run 'dfs -getmerge', which gives me a unified file (all the parts as one file), scp that to the Solr server, then run the actual push to Solr.
2. Find a way to serve this file (unified, not just its parts) via HTTP, so that Solr (1.4) can stream it. One could add an HTTP server, but I see no point in adding Apache to the mix when the HDFS browser is already running on 50070.
I'd really like to go with 2, as it seems a lot easier. What do you think?
thanks, alex