Yes, the sink serializer is where you would serialize it. The HTTP/JSON 
handler can be used to send the event; it simply converts the JSON event into 
Flume's own Event format. You can write a serializer that either knows the 
schema or reads it from configuration to parse the Flume event.  
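To make the client side concrete, here is a minimal Python sketch of posting events to the HTTP source in the JSON format the default JSONHandler accepts (a JSON array of objects with "headers" and "body"). The host/port are placeholders for whatever your HTTPSource config uses, and the base64 encoding of the body is an assumption/workaround for binary payloads (as noted below, JSON itself only carries UTF-8/16/32 text, so a custom handler or serializer on the Flume side would have to decode it again):

```python
import base64
import json
import urllib.request

def make_events(records):
    # Build a batch of events in the shape Flume's default JSONHandler
    # expects: [{"headers": {...}, "body": "..."}, ...]. The body must be
    # a string, so binary Avro payloads are base64-encoded here; the
    # "encoding" header is a hypothetical convention your own handler or
    # serializer would interpret.
    return [
        {
            "headers": {"encoding": "base64"},
            "body": base64.b64encode(record).decode("ascii"),
        }
        for record in records
    ]

def post_events(url, records):
    # POST the JSON batch to the HTTP source.
    payload = json.dumps(make_events(records)).encode("utf-8")
    req = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    return urllib.request.urlopen(req)

# Example (host and port are placeholders for your HTTPSource config):
# post_events("http://flume-host:5140", [b"\x01\x02binary-avro-bytes"])
```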


Hari

-- 
Hari Shreedharan


On Thursday, November 8, 2012 at 1:34 PM, Bart Verwilst wrote:

> Would the sink serializer from 
> https://cwiki.apache.org/FLUME/flume-1x-event-serializers.html ( avro_event ) 
> be the right tool for the job? Probably not, since I won't be able to send the 
> exact Avro schema over the HTTP/JSON link, and it will need conversion first. 
> I'm not a Java programmer though, so I think writing my own serializer would 
> be stretching it a bit. :(
>  
> Maybe I can use Hadoop streaming to import my Avro data or something... :(
> Kind regards,
> Bart
>  
> Hari Shreedharan schreef op 08.11.2012 22:12:
> > Writing to Avro files depends on how you serialize your data on the sink 
> > side, using a serializer. Note that JSON supports only UTF-8/16/32 
> > encoding, so if you want to send binary data you will need to write your 
> > own handler for that (you can use the JSON handler as an example) and 
> > configure the source to use that handler. Once the data is in Flume, just 
> > plug in your own serializer (which can take the byte array from the event 
> > and convert it into the schema you want) and write it out.
> >  
> >  
> > Thanks,
> > Hari
> >  
> > -- 
> > Hari Shreedharan
> >  
> > 
> > 
> > On Thursday, November 8, 2012 at 1:02 PM, Bart Verwilst wrote:
> > 
> > > Hi Hari,
> > >  
> > > Just to be absolutely sure, you can write to Avro files using this? If 
> > > so, I will try out a snapshot of 1.3 tomorrow and start playing with it. 
> > > ;)
> > >  
> > > Kind regards,
> > >  
> > > Bart
> > >  
> > >  
> > > Hari Shreedharan schreef op 08.11.2012 20:06:
> > > > No, I am talking about: 
> > > > https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=bc1928bc2e23293cb20f4bc2693a3bc262f507b3
> > > >  
> > > > This will be in the next release which will be out soon.
> > > >  
> > > >  
> > > > Thanks,
> > > > Hari
> > > >  
> > > > -- 
> > > > Hari Shreedharan
> > > >  
> > > > 
> > > > 
> > > > On Thursday, November 8, 2012 at 10:57 AM, Bart Verwilst wrote:
> > > > 
> > > > > Hi Hari,
> > > > > 
> > > > > Are you talking about ipc.HTTPTransceiver ( 
> > > > > http://nullege.com/codes/search/avro.ipc.HTTPTransceiver )? That was 
> > > > > the class I tried before I noticed it wasn't supported by Flume 1.2. 
> > > > > :) 
> > > > > I assume the HTTP/JSON source will also allow Avro to be received?
> > > > >  
> > > > > Kind regards,
> > > > > Bart
> > > > >  
> > > > > Hari Shreedharan schreef op 08.11.2012 19:51:
> > > > > > The next release, Flume 1.3.0, adds support for an HTTP source, 
> > > > > > which will allow you to send data to Flume via HTTP/JSON (the 
> > > > > > representation of the data is pluggable, but a JSON representation 
> > > > > > is the default). You could use this to write data to Flume from 
> > > > > > Python, which I believe has good HTTP and JSON support.
> > > > > >  
> > > > > >  
> > > > > > Thanks,
> > > > > > Hari
> > > > > >  
> > > > > > -- 
> > > > > > Hari Shreedharan
> > > > > >  
> > > > > > 
> > > > > > 
> > > > > > On Thursday, November 8, 2012 at 10:45 AM, Bart Verwilst wrote:
> > > > > > 
> > > > > > > Hi,
> > > > > > >  
> > > > > > > I've been spending quite a few hours trying to push Avro data to 
> > > > > > > Flume so I can store it on HDFS, all from Python.
> > > > > > > It seems impossible for now, since the only way to push Avro 
> > > > > > > data to Flume is through the deprecated Thrift bindings, which 
> > > > > > > look pretty cumbersome to get working.
> > > > > > > I would like to know the best way to import Avro data into Flume 
> > > > > > > with Python. Or maybe Flume isn't the right tool and I should 
> > > > > > > use something else? My goal is to have multiple Python workers 
> > > > > > > pushing data to HDFS, which (by means of Flume in this case) is 
> > > > > > > consolidated into one file there.
> > > > > > >  
> > > > > > > Any thoughts?
> > > > > > >  
> > > > > > > Thanks!
> > > > > > >  
> > > > > > > Bart
> > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > >  
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > 
> > > >  
> > > > 
> > > > 
> > > 
> > > 
> > > 
> > 
> >  
> > 
> 
> 
> 