Hi,

For the HDFS Sink we have 3 properties which determine the type and content of what gets written to the file:

writeFormat = text | writable
fileType = SequenceFile | DataStream | CompressedStream
serializer = text | avro_event | <custom>

Can one of the devs explain these in detail, including the output expected from the various combinations of the 3 values, and which combinations (if any) are invalid?

For example, what's the difference between the combination serializer = avro_event, fileType = SequenceFile and the combination serializer = avro_event, fileType = DataStream?

And what's the difference between writeFormat = text and writeFormat = writable?

To give some background: I am looking to serialize Avro events to HDFS in a SequenceFile, and I am trying to use org.apache.avro.mapreduce.* from my Hadoop jobs. I figure using SequenceFile should give better performance than text, but I am not exactly sure about the various Flume options I mentioned above.

Thanks
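
P.S. To make the question concrete, here is roughly how I would expect the two avro_event combinations to look in the agent config. This is only a sketch: the agent name (a1), sink names (k1, k2) and the HDFS path are placeholders I made up, and I am assuming the fileType and writeFormat keys take the hdfs. prefix while serializer does not.

```properties
# Placeholder agent/sink names and path, for illustration only.

# Combination A: avro_event serializer with a SequenceFile
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /flume/events-seq
a1.sinks.k1.hdfs.fileType = SequenceFile
a1.sinks.k1.hdfs.writeFormat = Writable
a1.sinks.k1.serializer = avro_event

# Combination B: avro_event serializer with a plain DataStream
a1.sinks.k2.type = hdfs
a1.sinks.k2.hdfs.path = /flume/events-stream
a1.sinks.k2.hdfs.fileType = DataStream
a1.sinks.k2.serializer = avro_event
```

What I would like to understand is what actually ends up in the files in each case, and whether writeFormat has any effect in combination B.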
