[ https://issues.apache.org/jira/browse/FLUME-2818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14970585#comment-14970585 ]
Kettler Karl commented on FLUME-2818:
-------------------------------------
Hello Gonzalo,
The problem is that in HDP 2.2 we can get JSON, but in HDP 2.3 we cannot.
We did not want to make technical changes that go too deep.
Is this the only possibility?
Kind regards,
Karl
> Problems with Avro data and not Json and no data in HDFS
> --------------------------------------------------------
>
> Key: FLUME-2818
> URL: https://issues.apache.org/jira/browse/FLUME-2818
> Project: Flume
> Issue Type: Request
> Components: Sinks+Sources
> Affects Versions: v1.5.2
> Environment: HDP-2.3.0.0-2557 Sandbox
> Reporter: Kettler Karl
> Priority: Critical
> Fix For: v1.5.2
>
>
> Flume delivers Twitter data in Avro format rather than JSON.
> Why?
> Flume agent configuration:
> TwitterAgent.sources = Twitter
> TwitterAgent.channels = MemChannel
> TwitterAgent.sinks = HDFS
> TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
> TwitterAgent.sources.Twitter.channels = MemChannel
> TwitterAgent.sources.Twitter.consumerKey = xxx
> TwitterAgent.sources.Twitter.consumerSecret = xxx
> TwitterAgent.sources.Twitter.accessToken = xxx
> TwitterAgent.sources.Twitter.accessTokenSecret = xxx
> TwitterAgent.sources.Twitter.maxBatchSize = 10
> TwitterAgent.sources.Twitter.maxBatchDurationMillis = 200
> TwitterAgent.sources.Twitter.keywords = United Nations
> TwitterAgent.sources.Twitter.deserializer.schemaType = LITERAL
> # HDFS Sink
> TwitterAgent.sinks.HDFS.channel = MemChannel
> TwitterAgent.sinks.HDFS.type = hdfs
> TwitterAgent.sinks.HDFS.hdfs.path = /demo/tweets/stream/%y-%m-%d/%H%M%S
> TwitterAgent.sinks.HDFS.hdfs.filePrefix = events
> TwitterAgent.sinks.HDFS.hdfs.round = true
> TwitterAgent.sinks.HDFS.hdfs.roundValue = 5
> TwitterAgent.sinks.HDFS.hdfs.roundUnit = minute
> TwitterAgent.sinks.HDFS.hdfs.useLocalTimeStamp = true
> TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
> TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
> TwitterAgent.channels.MemChannel.type = memory
> TwitterAgent.channels.MemChannel.capacity = 1000
> TwitterAgent.channels.MemChannel.transactionCapacity = 100
> Twitter Data from Flume:
> Obj avro.schema�
> {"type":"record","name":"Doc","doc":"adoc","fields":[{"name":"id","type":"string"},{"name":"user_friends_count","type":["int","null"]},{"name":"user_location","type":["string","null"]},{"name":"user_description","type":["string","null"]},{"name":"user_statuses_count","type":["int","null"]},{"name":"user_followers_count","type":["int","null"]},{"name":"user_name","type":["string","null"]},{"name":"user_screen_name","type":["string","null"]},{"name":"created_at","type":["string","null"]},{"name":"text","type":["string","null"]},{"name":"retweet_count","type":["long","null"]},{"name":"retweeted","type":["boolean","null"]},{"name":"in_reply_to_user_id","type":["long","null"]},{"name":"source","type":["string","null"]},{"name":"in_reply_to_status_id","type":["long","null"]},{"name":"media_url_https","type":["string","null"]},{"name":"expanded_url","type":["string","null"]}]}�]3hˊى���|����$656461386520784896�
> �お絵描きするショタコン/オタクまっしぐら。論破メインに雑食もぐもぐ/成人済み pixiv:323565 隔離:【@yh_u_】�n� ユハズ
> yhzz_(2015-10-20T13:26:05Z� はじめた~リセマラめんどくさいし緑茶来たから普通にこのまま進める
> https://t.co/ZpfDqw4l9g � <a href=" http://twitter.com"
> rel="nofollow">Twitter Web Client</a> ^
> https://pbs.twimg.com/media/CRw4Js3UAAAGusn.pngthttp://twitter.com/yhzz_/status/656461386520784896/photo/1$656461390677417984�
> <Mundo de las sombras (Cc,Extr)�#RP User de un agente del gobierno |20| Que
> no me veais ni noteis mi presencia no quiere decir que no os este observando
> desde las sombras�� � JKP® BakasumaUserSinCausa(2015-10-20T13:26:06Z� RT
> @NaiiVicious: @Lisi_Hattori @UserSinCausa https://t.co/M2LTJWwqae � <a href="
> http://twitter.com/download/android" rel="nofollow">Twitter for Android</a> ^
> https://pbs.twimg.com/media/CRthC1mWUAIFTF-.jpg�
> http://twitter.com/NaiiVicious/status/656224896297529344/photo/1�]3hˊى���|���
> After loading this Twitter data into an HDFS table, it is not possible to
> convert it to JSON with avro-tools-1.7.7.jar. We get the error message: "No data"
> If we try to read this file, we get the following error message:
> "java -jar avro-tools-1.7.7.jar tojson twitter.avro > twitter.json
> Exception in thread "main" org.apache.avro.AvroRuntimeException:
> java.io.EOFException"
> I hope you can help us.
> Kind regards,
> Karl
>
>
> Details
>
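One way to narrow down the EOFException quoted above is to check whether the file copied from HDFS is a complete Avro object container file. The sketch below is only a diagnostic aid, written in Python with the standard library. It follows the container layout from the Avro specification (the magic bytes "Obj" plus 0x01, a metadata map, a 16-byte sync marker, then data blocks) and reports whether the file stops in the middle of a block. All function names are hypothetical, it does not decode any records, and it is not part of Flume or avro-tools. Whether truncation is the actual cause in this issue is an assumption, not a confirmed finding.

```python
import io
import json
import sys

AVRO_MAGIC = b"Obj\x01"   # first four bytes of every Avro container file
SYNC_SIZE = 16            # length of the sync marker that follows the header


class TruncatedAvro(Exception):
    """Raised when the stream ends in the middle of a header or a block."""


def read_exact(stream, n):
    data = stream.read(n)
    if len(data) != n:
        raise TruncatedAvro(f"wanted {n} bytes, got {len(data)}")
    return data


def read_long(stream):
    """Decode one Avro long (variable-length, zigzag-encoded integer)."""
    shift = 0
    raw = 0
    while True:
        byte = read_exact(stream, 1)[0]
        raw |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7
    return (raw >> 1) ^ -(raw & 1)


def read_header(stream):
    """Return (metadata dict, sync marker) from the container header."""
    if read_exact(stream, 4) != AVRO_MAGIC:
        raise ValueError("not an Avro object container file")
    meta = {}
    while True:
        count = read_long(stream)
        if count == 0:
            break
        if count < 0:
            count = -count
            read_long(stream)  # byte size of this map block, not needed here
        for _ in range(count):
            key = read_exact(stream, read_long(stream)).decode("utf-8")
            meta[key] = read_exact(stream, read_long(stream))
    return meta, read_exact(stream, SYNC_SIZE)


def scan_blocks(stream, sync):
    """Walk the data blocks; return (blocks, records) for a complete file."""
    blocks = records = 0
    while True:
        pos = stream.tell()
        if not stream.read(1):      # clean end of file between blocks
            return blocks, records
        stream.seek(pos)
        count = read_long(stream)   # number of records in this block
        size = read_long(stream)    # size of the serialized records in bytes
        read_exact(stream, size)
        if read_exact(stream, SYNC_SIZE) != sync:
            raise ValueError("sync marker mismatch")
        blocks += 1
        records += count


def inspect(stream):
    """Summarise a seekable binary stream holding an Avro container."""
    meta, sync = read_header(stream)
    blocks, records = scan_blocks(stream, sync)
    return {
        "schema": json.loads(meta["avro.schema"]),
        "codec": meta.get("avro.codec", b"null").decode("utf-8"),
        "blocks": blocks,
        "records": records,
    }


def inspect_bytes(data):
    """Convenience wrapper for data that is already in memory."""
    return inspect(io.BytesIO(data))


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as f:
        try:
            print(inspect(f))
        except TruncatedAvro as exc:
            print("file is cut off:", exc)
```

Saved under a name of your choice and run with the local copy (for example twitter.avro) as its only argument, the script prints a summary when every block is complete, or a "file is cut off" message when the data ends inside a block. A ValueError instead means the file does not start with the Avro magic bytes or a sync marker does not match.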
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)