> Since each sink really has just one thread driving them, adding multiple 
> sinks might help.

Oh hey, how does hdfs.threadsPoolSize relate to adding multiple sinks?  The 
docs say this is the

  Number of threads per HDFS sink for HDFS IO ops (open, write, etc.)

I've got 24 cores (12 + hyperthreading) on the machine I'm using to test this 
stuff, but I only see one core under heavy use.  There are currently 98 Flume 
threads running, and they are (relatively) spread out across all of the CPUs.  I'm 
starting to suspect that the source thread just can't keep up with all of the 
incoming UDP data, so it is dropping packets somewhere.  When this happens with 
another C program that we use to consume this stream internally, I see the 
'drops' counter increase for the port in /proc/<pid>/net/udp, but I am not 
seeing this happen now.
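
For reference, here is roughly how I'm reading that counter. This is only a 
sketch in Python: the function names are made up, and the field layout (local 
port in hex in the second column, rx_queue in hex in the fifth, drops as the 
last column) is what I see on my Linux kernel, so treat it as an assumption. 
One thing I still need to rule out: if the JVM binds the socket over IPv6, I'd 
expect it to show up in /proc/net/udp6 rather than /proc/net/udp, which is why 
the path is a parameter.

```python
def udp_drops_for_port(proc_text, port):
    """Return [(rx_queue_bytes, drops), ...] for sockets bound to `port`.

    `proc_text` is the full contents of /proc/net/udp or /proc/net/udp6.
    """
    results = []
    for line in proc_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) < 13:
            continue  # not a socket row in the layout assumed here
        # local_address is HEXADDR:HEXPORT; rsplit also copes with udp6 rows
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        if local_port != port:
            continue
        # fifth column is tx_queue:rx_queue, both in hex
        rx_queue = int(fields[4].split(":")[1], 16)
        drops = int(fields[-1])  # last column, decimal
        results.append((rx_queue, drops))
    return results


def read_udp_drops(port, path="/proc/net/udp"):
    """Convenience wrapper that reads the proc file from disk."""
    with open(path) as f:
        return udp_drops_for_port(f.read(), port)
```

Calling read_udp_drops(5140) (5140 is just an example port) a few seconds 
apart and comparing the results shows whether drops is climbing, and whether 
rx_queue is sitting close to the socket's receive buffer size.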

Is there a way to know if the JVM (or in this case Netty?) is dropping UDP 
packets?  As far as I can tell, Java's UDP interface is just a wrapper around 
the native UDP socket implementation, so there shouldn't be anything hidden 
here.  Or maybe there is some sneaky JVM/Netty buffering going on that I don't 
know about?
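
In the meantime, the only other place I know to look is the kernel's 
system-wide UDP counters in /proc/net/snmp. A rough Python sketch of how I'd 
read them is below. The function name is made up, and I'm assuming the usual 
layout of a "Udp:" header line followed by a "Udp:" value line. This only 
tells me what the kernel dropped before the JVM ever saw the packet, not 
anything that happens inside Netty.

```python
def udp_counters(snmp_text):
    """Parse the 'Udp:' header/value line pair from /proc/net/snmp.

    Returns a dict such as {'InDatagrams': ..., 'RcvbufErrors': ...},
    or an empty dict if the pair is not found.
    """
    rows = [line.split() for line in snmp_text.splitlines()
            if line.startswith("Udp:")]  # "UdpLite:" rows do not match
    if len(rows) < 2:
        return {}
    names, values = rows[0][1:], rows[1][1:]
    return dict(zip(names, (int(v) for v in values)))


def read_udp_counters(path="/proc/net/snmp"):
    """Convenience wrapper that reads the proc file from disk."""
    with open(path) as f:
        return udp_counters(f.read())
```

If InErrors or RcvbufErrors grows between two calls to read_udp_counters() 
while the stream is running, the kernel is dropping datagrams on some socket 
on the box, even if the per-socket drops column I was looking at stays flat.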


-Andrew
