[
https://issues.apache.org/jira/browse/FLUME-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yuyang Gong updated FLUME-3200:
-------------------------------
Description:
I found some .tmp files in my S3 bucket.
Sometimes both the .tmp file and the final file exist with the same
content, and I have to remove the .tmp file.
For example:
flume.1512259200732.txt.gz.tmp and flume.1512259200732.txt.gz
Sometimes only the .tmp file exists, and I have to rename it manually.
This is my sink config:
agent.sinks.k1.type = hdfs
agent.sinks.k1.channel = c1
agent.sinks.k1.hdfs.path = s3a://mylogs/%Y-%m-%d/%H
agent.sinks.k1.hdfs.fileType = CompressedStream
agent.sinks.k1.hdfs.codeC = gzip
agent.sinks.k1.hdfs.filePrefix = flume
agent.sinks.k1.hdfs.fileSuffix = .txt.gz
agent.sinks.k1.hdfs.rollSize = 67108864
agent.sinks.k1.hdfs.rollInterval = 300
agent.sinks.k1.hdfs.rollCount = 100000
agent.sinks.k1.hdfs.batchSize = 1000
agent.sinks.k1.hdfs.useLocalTimeStamp = true
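The manual cleanup described above (delete the .tmp file when the final file also exists, rename it when it does not) could be planned with something like the sketch below. This is only an illustration, not part of Flume: the function name plan_tmp_cleanup is hypothetical, it works on a plain list of object keys rather than calling S3, and it assumes the in-use suffix is ".tmp" (the default of hdfs.inUseSuffix). It also takes the "same content" observation on trust; a real script should compare sizes or checksums and make sure Flume is no longer writing the file before deleting or renaming anything.

```python
from typing import Iterable, List, Tuple

TMP_SUFFIX = ".tmp"  # assumption: default hdfs.inUseSuffix, not overridden


def plan_tmp_cleanup(
    keys: Iterable[str],
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Split leftover .tmp keys into (to_delete, to_rename).

    to_delete: .tmp keys whose final counterpart already exists.
    to_rename: (tmp_key, final_key) pairs where only the .tmp key exists.
    """
    present = set(keys)
    to_delete: List[str] = []
    to_rename: List[Tuple[str, str]] = []
    for key in sorted(present):
        if not key.endswith(TMP_SUFFIX):
            continue  # already a final file, nothing to do
        final_key = key[: -len(TMP_SUFFIX)]
        if final_key in present:
            # Both files exist: the .tmp copy is the leftover.
            to_delete.append(key)
        else:
            # Only the .tmp file exists: it still needs its final name.
            to_rename.append((key, final_key))
    return to_delete, to_rename


if __name__ == "__main__":
    # The first two keys are the pair from the report; the third is made up.
    listing = [
        "flume.1512259200732.txt.gz.tmp",
        "flume.1512259200732.txt.gz",
        "flume.9999999999999.txt.gz.tmp",
    ]
    delete, rename = plan_tmp_cleanup(listing)
    print("delete:", delete)
    print("rename:", rename)
```

Keeping the planning step separate from the actual delete/rename calls makes it possible to review the plan before touching the bucket.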
was:
I found some .tmp files in my S3 bucket.
Sometimes both the .tmp file and the final file exist with the same
content, and I have to remove the .tmp file.
For example:
flume.1512259200732.txt.gz.tmp and flume.1512259200732.txt.gz
Sometimes only the .tmp file exists, and I have to rename it manually.
> Flume leaves .tmp files in HDFS (AWS S3) when the rename times out.
> -------------------------------------------------------------------
>
> Key: FLUME-3200
> URL: https://issues.apache.org/jira/browse/FLUME-3200
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: 1.8.0
> Environment: Ubuntu (AWS EC2)
> Reporter: Yuyang Gong