Hi Eric,

Can you create another class that takes a writer and makes it a pipeline writer? The pipeline logic should be extracted, and the current writers should be kept clean.
I'm saying that because I have a new writer implementation, and I would have to do something similar to what you're doing for near-real-time monitoring.

Thanks,
/Jerome.

On 12/18/09 9:16 AM, "Eric Yang" <[email protected]> wrote:

> Correction, the data has been written to HDFS correctly. Data were stuck at
> post-processing because the postProcess program crashed. I still need
> to determine the cause of the postProcess crash. I think the modified
> SeqFileWriter does what I wanted, and I will implement next.add() to ensure
> the ordering can be interchanged.
>
> Regards,
> Eric
>
> On 12/18/09 8:59 AM, "Eric Yang" <[email protected]> wrote:
>
>> I'd like to make a T on the incoming data: one writer goes into HDFS, and
>> another writer enables real-time pub/sub to monitor the data. In my case,
>> the data are mirrored, not filtered. However, I am not getting the right
>> result, because it seems the data isn't getting written into HDFS regardless
>> of the ordering of the writers.
>>
>> Regards,
>> Eric
>>
>> On 12/17/09 9:53 PM, "Ariel Rabkin" <[email protected]> wrote:
>>
>>> What's the use case for this?
>>>
>>> The original motivation for pipelined writers was so that we could do
>>> things like filtering before data got written. Then it occurred to me
>>> that SocketTeeWriter fit fairly naturally into a pipeline.
>>>
>>> Putting it "after" SeqFileWriter wouldn't be too bad --
>>> SeqFileWriter.add() would need to call next.add(). But I would be
>>> hesitant to commit that change without a really clear use case.
>>>
>>> --Ari
>>>
>>> On Thu, Dec 17, 2009 at 8:39 PM, Eric Yang <[email protected]> wrote:
>>>
>>>> It works fine after I put SocketTeeWriter first. What needs to be
>>>> implemented in SeqFileWriter to be able to pipe correctly?
>>>>
>>>> Regards,
>>>> Eric
>>>>
>>>> On 12/17/09 5:26 PM, "[email protected]" <[email protected]> wrote:
>>>>
>>>>> Put the SocketTeeWriter first.
>>>>>
>>>>> sent from my iPhone; please excuse typos and brevity.
>>>>>
>>>>> On Dec 17, 2009, at 8:12 PM, Eric Yang <[email protected]> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I had set up SocketTeeWriter by itself, with data streaming to the
>>>>>> next socket reader program. When I tried to configure two writers,
>>>>>> i.e., SeqFileWriter followed by SocketTeeWriter, it didn't work
>>>>>> because SeqFileWriter doesn't extend PipelineableWriter. I went
>>>>>> ahead and extended SeqFileWriter as a PipelineableWriter,
>>>>>> implemented the setNextStage method, and configured the collector
>>>>>> with:
>>>>>>
>>>>>> <property>
>>>>>>   <name>chukwaCollector.writerClass</name>
>>>>>>   <value>org.apache.hadoop.chukwa.datacollection.writer.PipelineStageWriter</value>
>>>>>> </property>
>>>>>>
>>>>>> <property>
>>>>>>   <name>chukwaCollector.pipeline</name>
>>>>>>   <value>org.apache.hadoop.chukwa.datacollection.writer.SeqFileWriter,org.apache.hadoop.chukwa.datacollection.writer.SocketTeeWriter</value>
>>>>>> </property>
>>>>>>
>>>>>> SeqFileWriter writes the data correctly, but when connected to
>>>>>> SocketTeeWriter, there was no data visible in SocketTeeWriter.
>>>>>> Commands work fine, but data streaming doesn't happen. How do I
>>>>>> configure the collector and PipelineStageWriter to be able to write
>>>>>> data into multiple writers? Is there something in SeqFileWriter that
>>>>>> could prevent this from working?
>>>>>>
>>>>>> Regards,
>>>>>> Eric
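[Editor's note] The symptom in the thread — data vanishing when SeqFileWriter is placed before SocketTeeWriter — follows from how a stage pipeline works: a writer that never calls next.add() silently terminates the chain. The sketch below models this with simplified, hypothetical stand-ins (the real Chukwa interfaces pass Chunk objects and can throw exceptions; the names Tee, Sink, and Pipeline are illustrative, not Chukwa's):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-ins for the Chukwa writer types (illustrative only).
interface ChukwaWriter {
    void add(List<String> chunks);
}

abstract class PipelineableWriter implements ChukwaWriter {
    protected ChukwaWriter next;
    public void setNextStage(ChukwaWriter next) { this.next = next; }
}

// A pipeline-aware writer: records the data, then forwards it downstream.
class Tee extends PipelineableWriter {
    final List<String> seen = new ArrayList<>();
    public void add(List<String> chunks) {
        seen.addAll(chunks);
        if (next != null) next.add(chunks);
    }
}

// A plain writer with no forwarding: once data reaches it the chain ends,
// mirroring the behavior Eric saw with SeqFileWriter placed first.
class Sink implements ChukwaWriter {
    final List<String> seen = new ArrayList<>();
    public void add(List<String> chunks) { seen.addAll(chunks); }
}

class Pipeline {
    // Wire each pipeline-aware stage to the stage after it in the list.
    static ChukwaWriter build(List<ChukwaWriter> stages) {
        for (int i = 0; i < stages.size() - 1; i++) {
            if (stages.get(i) instanceof PipelineableWriter) {
                ((PipelineableWriter) stages.get(i)).setNextStage(stages.get(i + 1));
            }
            // A stage that is not pipeline-aware never forwards, so
            // everything after it in the list receives no data.
        }
        return stages.get(0);
    }
}
```

With the Sink first, the Tee never sees data; with the Tee first, both do. That matches both the workaround (put SocketTeeWriter first) and the fix discussed above (have SeqFileWriter.add() call next.add()).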

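[Editor's note] Jerome's request at the top of the thread — one class that takes any writer and makes it a pipeline writer — could be sketched as a decorator, so writers like SeqFileWriter stay free of pipeline logic. All names below are hypothetical simplifications, not the actual Chukwa API:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins; the real Chukwa writer interfaces differ.
interface ChukwaWriter {
    void add(List<String> chunks);
}

abstract class PipelineableWriter implements ChukwaWriter {
    protected ChukwaWriter next;
    public void setNextStage(ChukwaWriter next) { this.next = next; }
}

// Hypothetical decorator: wraps any existing writer and supplies the
// pipeline behavior on its behalf.
class PipelineWriterAdapter extends PipelineableWriter {
    private final ChukwaWriter wrapped;
    PipelineWriterAdapter(ChukwaWriter wrapped) { this.wrapped = wrapped; }
    public void add(List<String> chunks) {
        wrapped.add(chunks);                 // let the wrapped writer work first
        if (next != null) next.add(chunks);  // then hand the same data onward
    }
}

// Trivial writer used only to demonstrate the adapter.
class RecordingWriter implements ChukwaWriter {
    final List<String> seen = new ArrayList<>();
    public void add(List<String> chunks) { seen.addAll(chunks); }
}
```

Any number of such adapters could then be chained, and a new writer implementation (like the near-real-time monitoring one Jerome mentions) would need no pipeline code of its own.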