Hi
I've seen this before. If you put a flume agent on a worker node that is running
an HDFS data node, and assuming you are using flume to write into HDFS, you will
find that the worker that has the flume agent on it will be the data node
chosen to house the (first replica of the) data. This may skew the distribution
of data across the cluster.
The following error occurs when your flume agent tries to write to the standby
NameNode:
"Operation category WRITE is not supported in state standby"
What failover mechanism are you using for your NameNodes?
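If you're on HDFS HA, it's usually worth pointing the sink at the HA
nameservice rather than a fixed NameNode hostname, so the client follows a
failover on its own. A rough sketch (the agent, sink, and nameservice names
here are hypothetical, and the agent needs the cluster's core-site.xml and
hdfs-site.xml on its classpath):

a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://mycluster/flume/events/%Y-%m-%d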
--
Chris Horrocks
On Sat, Apr 1, 2017 at 11:31 am, hui@wbkit.com wrote:
Hi Roberto,
Setting the roll intervals to 0 will stop the sink rolling the files in HDFS.
Try setting hdfs.rollCount to the number of messages you want to roll the file
on (i.e. the number of messages per file). Bear in mind that setting this low
will result in higher HDFS overhead.
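For example, to roll purely on event count you'd also zero out the size-based
trigger; a minimal sketch, with hypothetical agent/sink names and an arbitrary
count:

a1.sinks.k1.hdfs.rollInterval = 0
a1.sinks.k1.hdfs.rollSize = 0
a1.sinks.k1.hdfs.rollCount = 10000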
--
Chris
Hi,
Which version of Kafka are you using?
Off the top of my head it should be:
tier2.sources.source1.kafka.auto.offset.reset = earliest
Of course, changing the group ID, or (if it's an older version of Kafka)
removing the corresponding offset znode from ZooKeeper, ought to do the trick.
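If I remember right, newer Flume releases (1.7+) pass consumer properties
through with a kafka.consumer. prefix instead, so the equivalent there would be
something like this (the new group ID is just a hypothetical example):

tier2.sources.source1.kafka.consumer.auto.offset.reset = earliest
tier2.sources.source1.kafka.consumer.group.id = flume-fresh-group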
--
If the time stamp is passed as part of the flume event body you could extract
it via an interceptor and only pass the headers to Kafka.
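For instance, the stock regex_extractor interceptor can lift a timestamp from
the front of the body into a header; a sketch assuming hypothetical agent and
source names and a body that starts with a 13-digit epoch-millis timestamp:

a1.sources.r1.interceptors = ts
a1.sources.r1.interceptors.ts.type = regex_extractor
a1.sources.r1.interceptors.ts.regex = ^(\\d{13})
a1.sources.r1.interceptors.ts.serializers = s1
a1.sources.r1.interceptors.ts.serializers.s1.name = timestamp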
--
Chris Horrocks
On Mon, Sep 26, 2016 at 12:02 am, Kevin Tran <kevin...@gmail.com> wrote:
Hi,
Does anyone know ho
into Spark Streaming and keeping
flume as low overhead as possible, particularly if it's monitoring data that's
latency sensitive. For storing the calculation variables for consumption by
the interceptor I'd go with something like ZooKeeper.
--
Chris Horrocks
On Wed, Jul 27
In fact, looking at your error, the timeout looks like the hdfs.callTimeout, so
that's where I'd focus. Is your HDFS cluster particularly unperformant? 10s to
respond to a call is pretty slow.
--
Chris Horrocks
On Wed, Jul 20, 2016 at 9:25 am, Chris Horrocks <'chris@hor.r
You could look at tuning either hdfs.idleTimeout, hdfs.callTimeout, or
hdfs.retryInterval which can all be found at:
http://flume.apache.org/FlumeUserGuide.html#hdfs-sink
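hdfs.callTimeout defaults to 10000 milliseconds, which lines up with the 10s
mentioned above. A minimal sketch (agent/sink names hypothetical) that raises
it to a minute and spaces out the file-close retries:

a1.sinks.k1.hdfs.callTimeout = 60000
a1.sinks.k1.hdfs.retryInterval = 300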
--
Chris Horrocks
On Wed, Jul 20, 2016 at 9:01 am, no jihun <jees...@gmail.com> wrote:
@ch
Have you tried increasing the HDFS sink timeouts?
--
Chris Horrocks
On Wed, Jul 20, 2016 at 8:03 am, no jihun <jees...@gmail.com> wrote:
Hi.
I found some files on HDFS left in the OPEN_FOR_WRITE state.
This is the flume log for the file.
01 18 7 2016 16:12:02
…interceptor implementations
would allow you to conditionally bind certain events to a specific channel. I
suppose you could always write a custom interceptor to do it.
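The usual pattern is an interceptor that stamps a routing header combined with
a multiplexing channel selector keyed on it; a sketch, with the agent, source,
channel, and header names all hypothetical:

a1.sources.r1.channels = c1 c2
a1.sources.r1.selector.type = multiplexing
a1.sources.r1.selector.header = route
a1.sources.r1.selector.mapping.alerts = c1
a1.sources.r1.selector.default = c2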
--
Chris Horrocks
From: Hai Thai
Reply: user@flume.apache.org
Date: 15 June 2016 at 22:40:07
To: user@flume.apache.org
Su
…your use-case.
--
Chris Horrocks
From: Jason J. W. Williams
Reply: user@flume.apache.org
Date: 7 June 2016 at 19:03:59
To: user@flume.apache.org
Subject: Re: Kafka Sink random partition assignment
Thanks again Chris. I am curious why I see the round-robin behavior I expected
when using kaf
It's by design in Kafka (and by extension flume). The producers are designed to
be many-to-one (producers to partitions), and as such picking a random partition
every 10 minutes prevents separate producer instances from all randomly picking
the same partition.
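If you do need deterministic placement, later Flume releases let the Kafka sink
take the partition from an event header; a sketch assuming a version that
supports partitionIdHeader, with hypothetical names:

a1.sinks.k1.partitionIdHeader = partition
a1.sinks.k1.defaultPartitionId = 0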
--
Chris Horrocks
From:
…producer groups (yet) to ensure that producers
apportion the available partitions between them, as this would create a
synchronisation issue between what should be entirely independent processes.
--
Chris Horrocks
On 7 June 2016 at 00:32:29, Jason J. W. Williams
(jasonjwwilli...@gmail.com
Are the permissions on the files the same? Does the user running the flume
agents have read permissions?
Are the files still being written to/locked open by another process?
Are there any logs being generated by the flume agent?
--
Chris Horrocks
On 20 April 2016 at 08:00:14, Saurabh Sharma
The Kafka channel would allow you to set event retention within Kafka.
> On 15 Apr 2016, at 12:54, Gonzalo Herreros wrote:
>
> That would depend on the channel.
> AFAIK, all the channels provided are FIFO without expiration but technically
> you could implement a channel that does that.
>
> Y
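To expand on the Kafka channel suggestion above: the channel is just a Kafka
topic, so retention is controlled with ordinary topic settings rather than in
flume itself. A sketch using the Flume 1.7+ property names, with hypothetical
broker and topic names:

a1.channels.c1.type = org.apache.flume.channel.kafka.KafkaChannel
a1.channels.c1.kafka.bootstrap.servers = broker1:9092
a1.channels.c1.kafka.topic = flume-channel

Retention would then be set on that topic itself (e.g. retention.ms via
kafka-configs.sh).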
Does the hadoop slave have the HDFS client config & jars?
How are you deploying the flume agent? Are you using a hadoop distribution
manager like Cloudera Manager/Ambari/etc or is it a standalone instance?
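If it's standalone, one way (hypothetical paths, and assuming the stock
flume-ng launcher) is to put the Hadoop client config and jars on the agent's
classpath via flume-env.sh:

export FLUME_CLASSPATH=/etc/hadoop/conf:/usr/lib/hadoop/lib/*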
> On 20 Mar 2016, at 15:12, Kartik Vashishta wrote:
>
> hadoop slave
> …format in flume
> hdfs sink setting. Try changing from
> hdfs.fileType SequenceFile to
>
> hdfs.fileType DataStream
>
> in your flume conf file.
>
>
>
> On Fri, Aug 29, 2014 at 8:39 PM, Chris Horrocks
> mailto:chrisjhorro.
> …the sink
> fine and is running without any problem.
>
>
> Let me know if you want to see the jars in the lib folder of my flume installation.
> ------
> From: Chris Horrocks
> Sent: 29-08-2014 20:16
> To: user@flume.apache.org
> Subject: Hadoop 2x Compatibility
Hi All,
I'm pretty new to Flume so forgive the newbish question, but I've been
working with Hadoop 2x for a little while.
I'm trying to configure flume (1.5.0) with an HDFS sink, however the agent
won't start, citing the following error:
29 Aug 2014 13:40:13,435 ERROR [conf-file-poller-0]
(org.apac