It might be worth trying the debug output sink (I forget the exact sink name) to log the headers attached to events after the interceptor, to validate that the regex is working correctly for all events.
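As a quick offline check before wiring up a debug sink, the regex from the config quoted below can be run against a sample event. This is only a sketch: the sample event body is made up, `extract_headers` is a hypothetical helper (not part of Flume), and it assumes the properties loader collapses each `\\` in the file to a single `\` before the pattern is compiled.

```python
import re

# Regex exactly as written in flume-conf.properties (interceptor i3).
RAW_FROM_FILE = r'^.*\\"entryId\\":\\{\\"date\\":\\"(\\d\\d\\d\\d)-(\\d\\d)-(\\d\\d)T(\\d\\d):.*\\"\\}.*$'

# Assumption: loading the properties file turns each "\\" into a single "\".
PATTERN = re.compile(RAW_FROM_FILE.replace("\\\\", "\\"))

# Header names taken from serializers s1..s4 in the config.
HEADER_NAMES = ("year", "month", "day", "hour")


def extract_headers(body: str) -> dict:
    """Hypothetical helper: headers the regex would yield, or {} when it does not match."""
    match = PATTERN.search(body)
    if match is None:
        return {}
    return dict(zip(HEADER_NAMES, match.groups()))


if __name__ == "__main__":
    # Made-up event body; substitute a real record from the Kafka topic.
    sample = '{"entryId":{"date":"2017-01-13T10:55:35.000-07:00"},"sessionId":"abc"}'
    print(extract_headers(sample))
    print(extract_headers("record without an entryId date"))
```

Any real record that comes back empty here does not match the pattern, which is the case worth ruling out first.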
I set up this exact config at a previous company, so I know it works. I also remember needing to escape the regex in an odd way due to how Java was loading/parsing the config.

Best,
Iain

Sent from my iPhone

> On Jan 13, 2017, at 12:00 PM, Justin Workman <[email protected]> wrote:
>
> Absolutely, see below. Just to reiterate, when using the timestamp
> interceptor values to build the output path based on the timestamp in the
> Flume header, things roll correctly. The files also roll just fine based on
> file size. However, when using the regex_extractor interceptor to get the
> actual event's timestamp to use in the output path, the last file in each
> directory is never renamed/closed until Flume is restarted.
>
> flume-conf.properties
> agent1.sources = fpssKafkaTopic
> agent1.channels = fpssHdfsFileChannel
> agent1.sinks = fpssHdfsSink
>
> agent1.sources.fpssKafkaTopic.type = org.apache.flume.source.kafka.KafkaSource
> agent1.sources.fpssKafkaTopic.zookeeperConnect = zk-host:2181
> agent1.sources.fpssKafkaTopic.topic = first-pass-stream-sessionized
> agent1.sources.fpssKafkaTopic.groupId = flume-first-pass-stream-sessionized
> agent1.sources.fpssKafkaTopic.kafka.auto.offset.reset = smallest
> agent1.sources.fpssKafkaTopic.channels = fpssHdfsFileChannel
> agent1.sources.fpssKafkaTopic.interceptors = i1 i2 i3
> agent1.sources.fpssKafkaTopic.interceptors.i1.type = timestamp
> agent1.sources.fpssKafkaTopic.interceptors.i1.preserveExisting = false
> agent1.sources.fpssKafkaTopic.interceptors.i2.type = org.apache.flume.interceptor.HostInterceptor$Builder
> agent1.sources.fpssKafkaTopic.interceptors.i2.hostHeader = hostname
> agent1.sources.fpssKafkaTopic.interceptors.i2.useIP= false
> agent1.sources.fpssKafkaTopic.interceptors.i2.preserveExisting = true
> agent1.sources.fpssKafkaTopic.interceptors.i3.type = regex_extractor
> agent1.sources.fpssKafkaTopic.interceptors.i3.regex = ^.*\\"entryId\\":\\{\\"date\\":\\"(\\d\\d\\d\\d)-(\\d\\d)-(\\d\\d)T(\\d\\d):.*\\"\\}.*$
> agent1.sources.fpssKafkaTopic.interceptors.i3.serializers = s1 s2 s3 s4
> agent1.sources.fpssKafkaTopic.interceptors.i3.serializers.s1.name = year
> agent1.sources.fpssKafkaTopic.interceptors.i3.serializers.s2.name = month
> agent1.sources.fpssKafkaTopic.interceptors.i3.serializers.s3.name = day
> agent1.sources.fpssKafkaTopic.interceptors.i3.serializers.s4.name = hour
> agent1.sources.fpssKafkaTopic.kafka.consumer.timeout.ms = 100
>
> agent1.channels.fpssHdfsFileChannel.type = file
> agent1.channels.fpssHdfsFileChannel.checkpointDir = /opt/flume/file-channel/fpss/checkpoint
> agent1.channels.fpssHdfsFileChannel.dataDirs = /opt/flume/file-channel/fpss/data
>
> agent1.sinks.fpssHdfsSink.type = hdfs
> agent1.sinks.fpssHdfsSink.hdfs.filePrefix = %{hostname}-log
> agent1.sinks.fpssHdfsSink.hdfs.inUseSuffix = .tmp
> agent1.sinks.fpssHdfsSink.hdfs.path = hdfs://prodcluster/flumedata/processed/first-pass-stream/%{year}/%{month}/%{day}/%{hour}-00
> agent1.sinks.fpssHdfsSink.hdfs.kerberosPrincipal = [email protected]
> agent1.sinks.fpssHdfsSink.hdfs.kerberosKeytab = <keytab path removed for privacy>
> agent1.sinks.fpssHdfsSink.hdfs.rollInterval = 0
> agent1.sinks.fpssHdfsSink.hdfs.rollCount = 0
> ## Account for compression. See flume-2128
> ## My calculation: 512 * 1024 * 1024 * 2.75
> agent1.sinks.fpssHdfsSink.hdfs.rollSize = 1476395008
> # Close file if idle more than 300 seconds
> agent1.sinks.hdfsSink.hdfs.idleTimeout = 300
> agent1.sinks.hdfsSink.hdfs.useLocalTimeStamp = true
> agent1.sinks.fpssHdfsSink.hdfs.fileType = CompressedStream
> agent1.sinks.fpssHdfsSink.hdfs.codeC = snappy
> agent1.sinks.fpssHdfsSink.hdfs.writeFormat = Text
> agent1.sinks.fpssHdfsSink.channel = fpssHdfsFileChannel
> agent1.sinks.fpssHdfsSink.hdfs.batchSize = 10000
> agent1.sinks.fpssHdfsSink.hdfs.threadsPoolSize = 20
> agent1.sinks.fpssHdfsSink.hdfs.callTimeout = 20000
>
> HDFS Output Since Midnight (Notice the last file is never closed/renamed)
> hdfs dfs -ls /flumedata/processed/first-pass-stream/2017/01/13/*/
> 17/01/13 12:38:52 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> Found 7 items
> -rw-r--r-- 3 b2c_runtime hadoop 513710580 2017-01-13 00:09 /flumedata/processed/first-pass-stream/2017/01/13/00-00/flumeload100-log.1484290815397.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 514439844 2017-01-13 00:18 /flumedata/processed/first-pass-stream/2017/01/13/00-00/flumeload100-log.1484290815398.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 515125962 2017-01-13 00:28 /flumedata/processed/first-pass-stream/2017/01/13/00-00/flumeload100-log.1484290815399.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 513010837 2017-01-13 00:38 /flumedata/processed/first-pass-stream/2017/01/13/00-00/flumeload100-log.1484290815400.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 511315467 2017-01-13 00:49 /flumedata/processed/first-pass-stream/2017/01/13/00-00/flumeload100-log.1484290815401.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 508420966 2017-01-13 00:59 /flumedata/processed/first-pass-stream/2017/01/13/00-00/flumeload100-log.1484290815402.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 2503353 2017-01-13 00:59 /flumedata/processed/first-pass-stream/2017/01/13/00-00/flumeload100-log.1484290815403.snappy.tmp
> Found 6 items
> -rw-r--r-- 3 b2c_runtime hadoop 509116221 2017-01-13 01:10 /flumedata/processed/first-pass-stream/2017/01/13/01-00/flumeload100-log.1484294415705.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 507800675 2017-01-13 01:21 /flumedata/processed/first-pass-stream/2017/01/13/01-00/flumeload100-log.1484294415706.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 504432110 2017-01-13 01:32 /flumedata/processed/first-pass-stream/2017/01/13/01-00/flumeload100-log.1484294415707.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 501932914 2017-01-13 01:42 /flumedata/processed/first-pass-stream/2017/01/13/01-00/flumeload100-log.1484294415708.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 498136257 2017-01-13 01:50 /flumedata/processed/first-pass-stream/2017/01/13/01-00/flumeload100-log.1484294415709.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 60539 2017-01-13 01:50 /flumedata/processed/first-pass-stream/2017/01/13/01-00/flumeload100-log.1484294415710.snappy.tmp
> Found 6 items
> -rw-r--r-- 3 b2c_runtime hadoop 500879399 2017-01-13 02:11 /flumedata/processed/first-pass-stream/2017/01/13/02-00/flumeload100-log.1484298016017.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 501827071 2017-01-13 02:21 /flumedata/processed/first-pass-stream/2017/01/13/02-00/flumeload100-log.1484298016018.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 501489101 2017-01-13 02:32 /flumedata/processed/first-pass-stream/2017/01/13/02-00/flumeload100-log.1484298016019.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 501527838 2017-01-13 02:43 /flumedata/processed/first-pass-stream/2017/01/13/02-00/flumeload100-log.1484298016020.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 499393977 2017-01-13 02:54 /flumedata/processed/first-pass-stream/2017/01/13/02-00/flumeload100-log.1484298016021.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 1282327 2017-01-13 02:54 /flumedata/processed/first-pass-stream/2017/01/13/02-00/flumeload100-log.1484298016022.snappy.tmp
> Found 6 items
> -rw-r--r-- 3 b2c_runtime hadoop 501033294 2017-01-13 03:10 /flumedata/processed/first-pass-stream/2017/01/13/03-00/flumeload100-log.1484301615579.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 500933906 2017-01-13 03:20 /flumedata/processed/first-pass-stream/2017/01/13/03-00/flumeload100-log.1484301615580.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 505869233 2017-01-13 03:31 /flumedata/processed/first-pass-stream/2017/01/13/03-00/flumeload100-log.1484301615581.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 502910608 2017-01-13 03:41 /flumedata/processed/first-pass-stream/2017/01/13/03-00/flumeload100-log.1484301615582.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 499561080 2017-01-13 03:52 /flumedata/processed/first-pass-stream/2017/01/13/03-00/flumeload100-log.1484301615583.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 3616826 2017-01-13 03:52 /flumedata/processed/first-pass-stream/2017/01/13/03-00/flumeload100-log.1484301615584.snappy.tmp
> Found 6 items
> -rw-r--r-- 3 b2c_runtime hadoop 502243204 2017-01-13 04:11 /flumedata/processed/first-pass-stream/2017/01/13/04-00/flumeload100-log.1484305215893.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 508966498 2017-01-13 04:22 /flumedata/processed/first-pass-stream/2017/01/13/04-00/flumeload100-log.1484305215894.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 510972236 2017-01-13 04:34 /flumedata/processed/first-pass-stream/2017/01/13/04-00/flumeload100-log.1484305215895.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 513225577 2017-01-13 04:46 /flumedata/processed/first-pass-stream/2017/01/13/04-00/flumeload100-log.1484305215896.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 512743679 2017-01-13 04:57 /flumedata/processed/first-pass-stream/2017/01/13/04-00/flumeload100-log.1484305215897.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 3888775 2017-01-13 04:57 /flumedata/processed/first-pass-stream/2017/01/13/04-00/flumeload100-log.1484305215898.snappy.tmp
> Found 7 items
> -rw-r--r-- 3 b2c_runtime hadoop 515832251 2017-01-13 05:11 /flumedata/processed/first-pass-stream/2017/01/13/05-00/flumeload100-log.1484308811983.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 518077964 2017-01-13 05:20 /flumedata/processed/first-pass-stream/2017/01/13/05-00/flumeload100-log.1484308811984.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 519490676 2017-01-13 05:29 /flumedata/processed/first-pass-stream/2017/01/13/05-00/flumeload100-log.1484308811985.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 519105563 2017-01-13 05:37 /flumedata/processed/first-pass-stream/2017/01/13/05-00/flumeload100-log.1484308811986.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 518672209 2017-01-13 05:46 /flumedata/processed/first-pass-stream/2017/01/13/05-00/flumeload100-log.1484308811987.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 520019853 2017-01-13 05:53 /flumedata/processed/first-pass-stream/2017/01/13/05-00/flumeload100-log.1484308811988.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 1574211 2017-01-13 05:53 /flumedata/processed/first-pass-stream/2017/01/13/05-00/flumeload100-log.1484308811989.snappy.tmp
> Found 9 items
> -rw-r--r-- 3 b2c_runtime hadoop 521428204 2017-01-13 06:07 /flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.1484312413743.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 519885769 2017-01-13 06:15 /flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.1484312413744.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 519050891 2017-01-13 06:21 /flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.1484312413745.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 520691322 2017-01-13 06:29 /flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.1484312413746.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 520902319 2017-01-13 06:36 /flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.1484312413747.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 520831873 2017-01-13 06:42 /flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.1484312413748.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 519785647 2017-01-13 06:49 /flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.1484312413749.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 520590143 2017-01-13 06:55 /flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.1484312413750.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 4621367 2017-01-13 06:55 /flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.1484312413751.snappy.tmp
> Found 11 items
> -rw-r--r-- 3 b2c_runtime hadoop 522623760 2017-01-13 07:06 /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015214.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 523065112 2017-01-13 07:12 /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015215.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 523445533 2017-01-13 07:18 /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015216.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 523084945 2017-01-13 07:24 /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015217.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524283976 2017-01-13 07:30 /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015218.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 523923379 2017-01-13 07:36 /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015219.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 523910723 2017-01-13 07:42 /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015220.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524266095 2017-01-13 07:47 /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015221.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 523002505 2017-01-13 07:53 /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015222.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 520706211 2017-01-13 07:58 /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015223.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 8051588 2017-01-13 07:58 /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015224.snappy.tmp
> Found 11 items
> -rw-r--r-- 3 b2c_runtime hadoop 520528155 2017-01-13 08:05 /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618433.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 521761390 2017-01-13 08:11 /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618434.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 522548272 2017-01-13 08:16 /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618435.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 522616117 2017-01-13 08:22 /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618436.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 525953759 2017-01-13 08:28 /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618437.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524475009 2017-01-13 08:34 /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618438.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 523995339 2017-01-13 08:40 /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618439.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524188832 2017-01-13 08:47 /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618440.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 525303001 2017-01-13 08:53 /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618441.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 525606532 2017-01-13 08:59 /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618442.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 4486982 2017-01-13 08:59 /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618443.snappy.tmp
> Found 11 items
> -rw-r--r-- 3 b2c_runtime hadoop 525207364 2017-01-13 09:06 /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216987.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 526105891 2017-01-13 09:12 /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216988.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 526426735 2017-01-13 09:18 /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216989.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 525298099 2017-01-13 09:24 /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216990.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 525282945 2017-01-13 09:30 /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216991.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 523921005 2017-01-13 09:36 /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216992.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524827705 2017-01-13 09:42 /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216993.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524203463 2017-01-13 09:47 /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216994.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524678485 2017-01-13 09:53 /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216995.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524598220 2017-01-13 09:59 /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216996.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 3877959 2017-01-13 09:59 /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216997.snappy.tmp
> Found 10 items
> -rw-r--r-- 3 b2c_runtime hadoop 523000460 2017-01-13 10:06 /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813831.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 523455154 2017-01-13 10:12 /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813832.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 525465618 2017-01-13 10:18 /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813833.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524630955 2017-01-13 10:24 /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813834.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 527780298 2017-01-13 10:30 /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813835.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 526565562 2017-01-13 10:37 /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813836.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524936336 2017-01-13 10:43 /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813837.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524565610 2017-01-13 10:49 /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813838.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524276950 2017-01-13 10:55 /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813839.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 654810 2017-01-13 10:55 /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813840.snappy.tmp
> Found 11 items
> -rw-r--r-- 3 b2c_runtime hadoop 524174553 2017-01-13 11:06 /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415712.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524127864 2017-01-13 11:12 /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415713.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524778919 2017-01-13 11:18 /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415714.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524851182 2017-01-13 11:24 /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415715.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 525156750 2017-01-13 11:30 /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415716.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 525334538 2017-01-13 11:35 /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415717.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 527346578 2017-01-13 11:41 /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415718.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 525592734 2017-01-13 11:47 /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415719.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 525502291 2017-01-13 11:53 /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415720.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 523135186 2017-01-13 11:58 /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415721.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 9967141 2017-01-13 11:58 /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415722.snappy.tmp
> Found 7 items
> -rw-r--r-- 3 b2c_runtime hadoop 520881970 2017-01-13 12:05 /flumedata/processed/first-pass-stream/2017/01/13/12-00/flumeload100-log.1484334016849.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 522340745 2017-01-13 12:11 /flumedata/processed/first-pass-stream/2017/01/13/12-00/flumeload100-log.1484334016850.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524156495 2017-01-13 12:17 /flumedata/processed/first-pass-stream/2017/01/13/12-00/flumeload100-log.1484334016851.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 523482390 2017-01-13 12:23 /flumedata/processed/first-pass-stream/2017/01/13/12-00/flumeload100-log.1484334016852.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524096591 2017-01-13 12:29 /flumedata/processed/first-pass-stream/2017/01/13/12-00/flumeload100-log.1484334016853.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 523184628 2017-01-13 12:35 /flumedata/processed/first-pass-stream/2017/01/13/12-00/flumeload100-log.1484334016854.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 10981218 2017-01-13 12:35 /flumedata/processed/first-pass-stream/2017/01/13/12-00/flumeload100-log.1484334016855.snappy.tmp
>
> HDFS Stat On One Of The Files (Keep in mind the output bucket is based on
> event time, which is MDT/MST, vs. the stat date in GMT)
> hadoop fs -stat "%y %n" /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813840.snappy.tmp
> 17/01/13 12:57:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 2017-01-13 17:55:35 flumeload100-log.1484326813840.snappy.tmp
>
> Thanks
> Justin
>
>> On Thu, Jan 12, 2017 at 11:56 PM, Denes Arvay <[email protected]> wrote:
>> Hi Justin,
>>
>> Could you please share your config file with us?
>>
>> Thanks,
>> Denes
>>
>>> On Thu, Jan 12, 2017, 20:20 Justin Workman <[email protected]> wrote:
>>> Sorry for cross-posting to user and dev. I have recently set up a Flume
>>> configuration where we are using the regex_extractor interceptor to parse
>>> the actual event date from the record flowing through the Flume source,
>>> then using that date to build the HDFS sink bucket path. However, it
>>> appears that the hdfs.idleTimeout value is not honored in this
>>> configuration. It does work when using the timestamp interceptor to build
>>> the output path.
>>>
>>> I have set the hdfs.idleTimeout value for the HDFS sink, but the files are
>>> never closed or renamed until I restart or shut down Flume. Our Flume is
>>> configured to roll based on size or output path, and the files
>>> rename/close/roll fine based on size; however, the last file in each output
>>> path is always left with the .tmp extension until we restart Flume. I would
>>> expect the file to be renamed and closed if no records are written to it
>>> after the idleTimeout is reached.
>>>
>>> Could I be missing something, or is this a known bug with the
>>> regex_extractor interceptor?
>>>
>>> Thanks
>>> Justin
