Hi Eric,

Can you create another class that takes a writer and makes it a pipeline writer? The pipeline logic should be extracted, and the current writers should be kept clean.
I'm saying that because I have a new writer implementation, and I would have to do something similar to what you're doing for near-real-time monitoring.

Thanks,
/Jerome.

On 12/18/09 9:16 AM, "Eric Yang" <[email protected]> wrote:

> Correction, the data has been written to HDFS correctly. Data were stuck at
> post-processing because the postProcess program crashed. I still need
> to determine the cause of the postProcess crash. I think the modified
> SeqFileWriter does what I wanted, and I will implement next.add() to ensure
> the ordering can be interchanged.
>
> Regards,
> Eric
>
> On 12/18/09 8:59 AM, "Eric Yang" <[email protected]> wrote:
>
>> I'd like to make a T on the incoming data: one writer goes into HDFS, and
>> another writer enables real-time pub/sub to monitor the data. In my case,
>> the data are mirrored, not filtered. However, I am not getting the right
>> result, because it seems the data isn't getting written into HDFS regardless
>> of the ordering of the writers.
>>
>> Regards,
>> Eric
>>
>> On 12/17/09 9:53 PM, "Ariel Rabkin" <[email protected]> wrote:
>>
>>> What's the use case for this?
>>>
>>> The original motivation for pipelined writers was so that we could do
>>> things like filtering before data got written. Then it occurred to me
>>> that SocketTeeWriter fit fairly naturally into a pipeline.
>>>
>>> Putting it "after" SeqFileWriter wouldn't be too bad --
>>> SeqFileWriter.add() would need to call next.add(). But I would be
>>> hesitant to commit that change without a really clear use case.
>>>
>>> --Ari
>>>
>>> On Thu, Dec 17, 2009 at 8:39 PM, Eric Yang <[email protected]> wrote:
>>>
>>>> It works fine after I put SocketTeeWriter first. What needs to be
>>>> implemented in SeqFileWriter to be able to pipe correctly?
>>>>
>>>> Regards,
>>>> Eric
>>>>
>>>> On 12/17/09 5:26 PM, "[email protected]" <[email protected]> wrote:
>>>>
>>>>> Put the SocketTeeWriter first.
>>>>>
>>>>> sent from my iPhone; please excuse typos and brevity.
>>>>>
>>>>> On Dec 17, 2009, at 8:12 PM, Eric Yang <[email protected]> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I had set up SocketTeeWriter by itself, with data streaming to the
>>>>>> next socket reader program. When I tried to configure two writers,
>>>>>> i.e., SeqFileWriter followed by SocketTeeWriter, it didn't work
>>>>>> because SeqFileWriter doesn't extend PipelineableWriter. I went
>>>>>> ahead and extended SeqFileWriter as a PipelineableWriter,
>>>>>> implemented the setNextStage method, and configured the collector
>>>>>> with:
>>>>>>
>>>>>> <property>
>>>>>>   <name>chukwaCollector.writerClass</name>
>>>>>>   <value>org.apache.hadoop.chukwa.datacollection.writer.PipelineStageWriter</value>
>>>>>> </property>
>>>>>>
>>>>>> <property>
>>>>>>   <name>chukwaCollector.pipeline</name>
>>>>>>   <value>org.apache.hadoop.chukwa.datacollection.writer.SeqFileWriter,org.apache.hadoop.chukwa.datacollection.writer.SocketTeeWriter</value>
>>>>>> </property>
>>>>>>
>>>>>> SeqFileWriter writes the data correctly, but when connected to
>>>>>> SocketTeeWriter, there was no data visible in SocketTeeWriter.
>>>>>> Commands work fine, but data streaming doesn't happen. How do I
>>>>>> configure the collector and PipelineStageWriter to be able to write
>>>>>> data into multiple writers? Is there something in SeqFileWriter that
>>>>>> could prevent this from working?
>>>>>>
>>>>>> Regards,
>>>>>> Eric
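[Editor's note] The symptom in the thread — data vanishing when SeqFileWriter is placed before SocketTeeWriter — follows from how a stage pipeline works: a writer that never calls next.add() silently terminates the chain. The sketch below models this with simplified, hypothetical stand-ins (the real Chukwa interfaces pass Chunk objects and can throw exceptions; the names Tee, Sink, and Pipeline are illustrative, not Chukwa's):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-ins for the Chukwa writer types (illustrative only).
interface ChukwaWriter {
    void add(List<String> chunks);
}

abstract class PipelineableWriter implements ChukwaWriter {
    protected ChukwaWriter next;
    public void setNextStage(ChukwaWriter next) { this.next = next; }
}

// A pipeline-aware writer: records the data, then forwards it downstream.
class Tee extends PipelineableWriter {
    final List<String> seen = new ArrayList<>();
    public void add(List<String> chunks) {
        seen.addAll(chunks);
        if (next != null) next.add(chunks);
    }
}

// A plain writer with no forwarding: once data reaches it the chain ends,
// mirroring the behavior Eric saw with SeqFileWriter placed first.
class Sink implements ChukwaWriter {
    final List<String> seen = new ArrayList<>();
    public void add(List<String> chunks) { seen.addAll(chunks); }
}

class Pipeline {
    // Wire each pipeline-aware stage to the stage after it in the list.
    static ChukwaWriter build(List<ChukwaWriter> stages) {
        for (int i = 0; i < stages.size() - 1; i++) {
            if (stages.get(i) instanceof PipelineableWriter) {
                ((PipelineableWriter) stages.get(i)).setNextStage(stages.get(i + 1));
            }
            // A stage that is not pipeline-aware never forwards, so
            // everything after it in the list receives no data.
        }
        return stages.get(0);
    }
}
```

With the Sink first, the Tee never sees data; with the Tee first, both do. That matches both the workaround (put SocketTeeWriter first) and the fix discussed above (have SeqFileWriter.add() call next.add()).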

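[Editor's note] Jerome's request at the top of the thread — one class that takes any writer and makes it a pipeline writer — could be sketched as a decorator, so writers like SeqFileWriter stay free of pipeline logic. All names below are hypothetical simplifications, not the actual Chukwa API:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins; the real Chukwa writer interfaces differ.
interface ChukwaWriter {
    void add(List<String> chunks);
}

abstract class PipelineableWriter implements ChukwaWriter {
    protected ChukwaWriter next;
    public void setNextStage(ChukwaWriter next) { this.next = next; }
}

// Hypothetical decorator: wraps any existing writer and supplies the
// pipeline behavior on its behalf.
class PipelineWriterAdapter extends PipelineableWriter {
    private final ChukwaWriter wrapped;
    PipelineWriterAdapter(ChukwaWriter wrapped) { this.wrapped = wrapped; }
    public void add(List<String> chunks) {
        wrapped.add(chunks);                 // let the wrapped writer work first
        if (next != null) next.add(chunks);  // then hand the same data onward
    }
}

// Trivial writer used only to demonstrate the adapter.
class RecordingWriter implements ChukwaWriter {
    final List<String> seen = new ArrayList<>();
    public void add(List<String> chunks) { seen.addAll(chunks); }
}
```

Any number of such adapters could then be chained, and a new writer implementation (like the near-real-time monitoring one Jerome mentions) would need no pipeline code of its own.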