Hi,

On Thu, Mar 20, 2014 at 3:29 AM, Phan, Truong Q <[email protected]> wrote:
> Damned if you do, damned if you don't. :)
>

;) We will get there eventually. All I meant was that it seemed a very long
message for what you seem to have summarized very well below.

> All links mentioned in the below link are not working and displayed the
> error message below.
>
> BTW, I am using Python not Java to code Avro.
>
> <snip>
>
> 1) Can I control the filename
> http://wiki.apache.org/hadoop/FAQ#How_do_I_change_final_output_file_name_with_the_desired_name_rather_than_in_partitions_like_part-00000.2C_part-00001.3F
> </snip>
>
> <snip>
>
> *An Exception Has Occurred*
>
> Unknown location:
> /hadoop/core/trunk/src/mapred/org/apache/hadoop/mapred/TextOutputFormat.java
>
> *HTTP Response Status*
>
> 404 Not Found
>
> </snip>
>

I've now fixed this and you should be able to access the links without any
problem.

>
> Here are my Avro/Python/MapReduce question/request:
>
> 1) If I am not using Hadoop's MapReduce Streaming then the Avro's
> DataFileWriter method will write data into my "custom" filenames. However,
> if I am using the Hadoop's MapReduce Streaming then the Avro's
> DataFileWriter method will create empty files with the Hadoop's
> default filenames (part-0000*) into the HDFS. Strangely, Avro's
> DataFileWriter method will create empty files with Hadoop's default
> filename (part-00000*). How do I use Avro's DataFileWriter method in Python
> to write data into my custom file name in HDFS?
>

OK, so as I explained, you need to create a class which implements
OutputFormat, which is what lets you change the resulting file name. I don't
think you have any way of achieving this unless you write some code.

> 2) Do you have Python's sample codes to control the filename and
> location to put our Avro's files into the HDFS?
>

Off the top of my head, no, I don't, sorry. Maybe someone else can help you
out here.
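
If coding an OutputFormat in Java is not an option, renaming the files after
the job finishes is a different technique that may be enough. The following is
only a rough sketch, and it assumes that the part-NNNNN files written by the
job actually contain your Avro output. The function name rename_plan, the
"events" prefix and the /data/out directory are made up for illustration. The
function only builds the "hadoop fs -mv" commands; it does not run them.

```python
import re

# Hadoop's default output names look like part-00000, optionally with an
# extension such as .avro. Anything else (_SUCCESS, _logs, ...) is ignored.
PART_RE = re.compile(r"^part-(\d{5})(\.avro)?$")


def rename_plan(part_files, prefix, output_dir):
    """Build `hadoop fs -mv` commands that rename default part files.

    part_files: file names found in the job output directory
    prefix:     hypothetical custom name prefix, e.g. "events"
    output_dir: HDFS directory the job wrote to
    Returns a list of argument lists; nothing is executed here.
    """
    plan = []
    for name in sorted(part_files):
        match = PART_RE.match(name)
        if match is None:
            continue  # not a part file, leave it alone
        target = "%s-%s.avro" % (prefix, match.group(1))
        plan.append([
            "hadoop", "fs", "-mv",
            "%s/%s" % (output_dir, name),
            "%s/%s" % (output_dir, target),
        ])
    return plan


if __name__ == "__main__":
    for command in rename_plan(["part-00000", "part-00001", "_SUCCESS"],
                               "events", "/data/out"):
        print(" ".join(command))
```

Each returned list can be passed to subprocess.check_call once the streaming
job has completed. This keeps the job itself unchanged, at the cost of an
extra step after it.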
