1. Be careful: HDFS is better for large files, not bunches of small files.
2. If that's really what you want, roll your own.
def writeLines(iterator: Iterator[(String, String)]) = {
  val writers = new mutable.HashMap[String, BufferedWriter] // (key, writer) map
  try {
    while (iterator.hasNext) {
      val (key, line) = iterator.next()
      writers.getOrElseUpdate(key, openWriterFor(key)).write(line + "\n") // openWriterFor: however you open the key's output file
    }
  } finally writers.values.foreach(_.close()) // close every writer, even on failure
}
Understood, thank you.
Small files are a problem; I am considering processing the data before putting
it into HDFS.
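One way to act on that idea — processing the data before it reaches HDFS — is to compact many small per-key records into one larger file per key. A minimal local sketch (the `Compact` object, the `<key>.log` file layout, and the base directory are all assumptions for illustration, not part of the original discussion):

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}

object Compact {
  // Groups records by key in memory and writes one file per key,
  // instead of one tiny file per record (assumed layout: <base>/<key>.log).
  def compact(records: Seq[(String, String)], base: String): Unit =
    records.groupBy(_._1).foreach { case (key, group) =>
      val body = group.map(_._2).mkString("", "\n", "\n")
      Files.write(Paths.get(base, s"$key.log"),
        body.getBytes(StandardCharsets.UTF_8))
    }
}
```

This trades memory for file count; for data too large to group in memory, the same idea would be applied per batch or per partition before upload.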
On Tue, Aug 12, 2014 at 9:37 PM, Fengyun RAO raofeng...@gmail.com wrote:
1. Be careful: HDFS is better for large files, not bunches of small files.
2. If that's really what you want, roll your own.
hi,
I have googled and found a similar question without a good answer:
http://stackoverflow.com/questions/24520225/writing-to-hadoop-distributed-file-system-multiple-times-with-spark
In short, I would like to split the raw data by some key, for example create
date, and put each part in its own directory.
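What the question asks for — splitting records by a key such as create date into one directory per key — can be sketched locally like this (the `PartitionByDate` object, the `part-00000` file name, and the directory layout are assumptions for illustration; on a cluster each key would map to an HDFS output path instead):

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths, StandardOpenOption}

object PartitionByDate {
  // Appends each record to <base>/<dateKey>/part-00000,
  // creating the per-key directory on first use.
  def write(records: Seq[(String, String)], base: String): Unit =
    records.foreach { case (date, line) =>
      val dir = Paths.get(base, date)
      Files.createDirectories(dir)
      Files.write(dir.resolve("part-00000"),
        (line + "\n").getBytes(StandardCharsets.UTF_8),
        StandardOpenOption.CREATE, StandardOpenOption.APPEND)
    }
}
```

For the Spark side of this, the usual suggestion on the linked Stack Overflow question is `rdd.saveAsHadoopFile` with a subclass of Hadoop's `MultipleTextOutputFormat` that overrides `generateFileNameForKeyValue`, which produces a similar per-key layout without hand-managed writers.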