Fantastic. So with this deserializer setting, it’s not dependent on the source being a logger type?
> On Aug 31, 2015, at 11:12 AM, iain wright <[email protected]> wrote:
>
> Hi Guyle,
>
> We ran into the same thing.
>
> Please see https://flume.apache.org/FlumeUserGuide.html#line
>
> On the originating source, where the event enters Flume for the first time,
> increase maxLineLength, i.e.:
>
> ...
> agent1.sources.source1.deserializer.maxLineLength = 1048576
> ...
>
> Best,
>
> --
> Iain Wright
>
> On Mon, Aug 31, 2015 at 11:03 AM, Guyle M. Taber <[email protected]> wrote:
>
> I’m using an Avro sink to send events to HDFS, and with long content lines
> we’re seeing our lines truncated at about the 2060-character mark. How can I
> prevent long lines from being truncated when using an Avro sink in this
> fashion?
>
> Here’s a snippet of an event from the raw logs before Flume is involved. I’ve
> toggled hidden characters so you can see where the EOL character gets
> inserted, breaking the event into two lines.
>
> …utm_campaign=%E5%81%A5%E5%BA%B7%E7%BE%8E%E6%8A%A4&camp=%E5%81%A5%E5%BA%B7%E7%BE%8E%E6%8A%A4^Isearch-term[=]^Isession-id[=]720D69AB19F1DD17D27A948C9B31D380^Istore-id[=]^Itracking-ticket-id[=]^Itracking-ticket-number[=]^Ievent-session-id[=]98df4905-51ab-43a9-92d9-35d879a69b9a
> $
>
> Here’s a snippet of an event that gets truncated:
>
> …utm_campaign=%E5%81%A5%E5%BA%B7%E7%BE%8E%E6%8A%A4&camp=%E5%81%A5%E5%BA%$
> B7%E7%BE%8E%E6%8A%A4^Isearch-term[=]^Isession-id[=]720D69AB19F1DD17D27A948C9B31D380^Istore-id[=]^Itracking-ticket-id[=]^Itracking-ticket-number[=]^Ievent-session-id[=]98df4905-51ab-43a9-92d9-35d879a69b9a
> $
>
> Here is our sink on the sending node:
>
> agent.sinks = AvroSink
> agent.sinks.AvroSink.type = avro
> agent.sinks.AvroSink.channel = memoryChannel
> agent.sinks.AvroSink.hostname = flume.mydomain.int
> agent.sinks.AvroSink.port = 4169
> agent.sinks.AvroSink.batchSize = 0
> agent.sinks.AvroSink.rollSize = 0
> agent.sinks.AvroSink.rollInterval = 0
> agent.sinks.AvroSink.rollCount = 0
> agent.sinks.AvroSink.idleTimeout = 0
> agent.sinks.AvroSink.useLocalTimeStamp = true
>
> Here is our sink on the HDFS receiving side:
>
> dp1.sinks.sinkCN.type = hdfs
> dp1.sinks.sinkCN.channel = channelCN
> dp1.sinks.sinkCN.hdfs.filePrefix = %{basename}-
> dp1.sinks.sinkCN.hdfs.path = hdfs://sf1-hadoopnn1.mydomain.int/flume/events/ods/cn/fe_event/%{host}/%y-%m-%d
> dp1.sinks.sinkCN.hdfs.fileType = DataStream
> dp1.sinks.sinkCN.hdfs.writeFormat = Text
> dp1.sinks.sinkCN.hdfs.rollSize = 0
> dp1.sinks.sinkCN.hdfs.rollCount = 0
> dp1.sinks.sinkCN.hdfs.batchSize = 5000
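For context, the suggested fix can be sketched as a full source stanza. This is a minimal sketch assuming a spooling-directory source; the agent name (`agent1`), source name (`source1`), channel name, and spool path are hypothetical placeholders, not taken from the thread. The key point is that `deserializer.maxLineLength` belongs on the originating source's LINE deserializer, not on the Avro sink or the downstream HDFS sink; the LINE deserializer's default limit of 2048 characters lines up with the ~2060-character truncation reported above.

```
# Hypothetical names (agent1, source1, channel1) and spool path for illustration.
agent1.sources = source1
agent1.channels = channel1

agent1.sources.source1.type = spooldir
agent1.sources.source1.channels = channel1
agent1.sources.source1.spoolDir = /var/log/incoming

# LINE is the default deserializer for the spooling-directory source; it
# truncates each event at maxLineLength characters (default 2048), so raise
# the limit here, at the point where events first enter Flume.
agent1.sources.source1.deserializer = LINE
agent1.sources.source1.deserializer.maxLineLength = 1048576
```

Sinks and channels elsewhere in the pipeline then carry the full event through unchanged, which is consistent with the observation in the reply above that the setting is applied where the event enters Flume for the first time.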
