[ 
https://issues.apache.org/jira/browse/FLUME-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Scarlatti updated FLUME-2795:
-----------------------------------
    Description: 
I have a hdfs sink with this config:

tier1.sinks.sink1.type         = hdfs
tier1.sinks.sink1.channel      = channel1
tier1.sinks.sink1.hdfs.path    = /user/bla/%y-%m-%d
tier1.sinks.sink1.hdfs.filePrefix =bla
tier1.sinks.sink1.hdfs.rollSize = 0
tier1.sinks.sink1.hdfs.rollInterval = 0
tier1.sinks.sink1.hdfs.rollCount = 150000
tier1.sinks.sink1.hdfs.useLocalTimeStamp = true
tier1.sinks.sink1.hdfs.fileType = DataStream
tier1.sinks.sink1.hdfs.batchSize = 100

Every night at 23:59:59 a new folder is created in HDFS, and the last file 
in the previous day's folder is left with a .tmp extension. That file is 
incomplete; only when the Flume agent is restarted is this .tmp file 
completed, closed, and renamed.


  was:
I have a hdfs sink with this config:

tier1.sinks.sink1.type         = hdfs
tier1.sinks.sink1.channel      = channel1
tier1.sinks.sink1.hdfs.path    = /user/bla/%y-%m-%d
tier1.sinks.sink1.hdfs.filePrefix =bla
tier1.sinks.sink1.hdfs.rollSize = 0
tier1.sinks.sink1.hdfs.rollInterval = 0
tier1.sinks.sink1.hdfs.rollCount = 150000
tier1.sinks.sink1.hdfs.useLocalTimeStamp = true
tier1.sinks.sink1.hdfs.fileType = DataStream
tier1.sinks.sink1.hdfs.batchSize = 100

Every night at 23:59:29 a new folder is created in HDFS, and the last file 
in the previous day's folder is left with a .tmp extension. That file is 
incomplete; only when the Flume agent is restarted is this .tmp file 
completed, closed, and renamed.



> Sinks with an hdfs path containing escape sequences do not close the 
> current .tmp file when changing to a new directory
> ------------------------------------------------------------------------------------------------------
>
>                 Key: FLUME-2795
>                 URL: https://issues.apache.org/jira/browse/FLUME-2795
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: v1.5.0
>         Environment: CDH 5.4.4 on Ubuntu
>            Reporter: David Scarlatti
>
> I have a hdfs sink with this config:
> tier1.sinks.sink1.type         = hdfs
> tier1.sinks.sink1.channel      = channel1
> tier1.sinks.sink1.hdfs.path    = /user/bla/%y-%m-%d
> tier1.sinks.sink1.hdfs.filePrefix =bla
> tier1.sinks.sink1.hdfs.rollSize = 0
> tier1.sinks.sink1.hdfs.rollInterval = 0
> tier1.sinks.sink1.hdfs.rollCount = 150000
> tier1.sinks.sink1.hdfs.useLocalTimeStamp = true
> tier1.sinks.sink1.hdfs.fileType = DataStream
> tier1.sinks.sink1.hdfs.batchSize = 100
> Every night at 23:59:59 a new folder is created in HDFS, and the last 
> file in the previous day's folder is left with a .tmp extension. That 
> file is incomplete; only when the Flume agent is restarted is this .tmp 
> file completed, closed, and renamed.
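
A possible workaround, offered as an untested sketch rather than a confirmed
fix for this issue: the HDFS sink has an hdfs.idleTimeout property (in
seconds; the default of 0 disables it) that closes a file once it has
received no writes for that long. Because the previous day's file stops
receiving events as soon as the path changes, setting it should allow that
.tmp file to be closed and renamed without restarting the agent. The value 60
below is an arbitrary assumption; the sink name is taken from the reported
config.

```properties
# Assumed value: close any open file that has had no writes for 60 seconds.
# Not part of the reporter's original configuration.
tier1.sinks.sink1.hdfs.idleTimeout = 60
```

With rollCount = 150000 as the only roll trigger, a file in the old directory
that never reaches that count would otherwise have nothing to close it.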



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
