Thank you, Joe.

CompressContent works for me. I chose 'decompress' mode and the 'gzip' 
compression format, and it works like a charm!


Roland.
From: Joe Witt [mailto:[email protected]]
Sent: Wednesday, August 12, 2015 10:38 AM
To: [email protected]
Subject: Re: UnpackContent processor cannot unpack gz file

Hello

UnpackContent is for dealing with archive formats (tar, zip, etc.).

If your file is in a compression format (as is the case with the 
part-00002.gz file), then you first need to run it through 'CompressContent' 
in 'decompress' mode.  You can even run it through 'IdentifyMimeType' first 
and set up a flow to handle arbitrarily complicated layers of 
compression/archive structures.
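
To illustrate the difference between a compression format and an archive 
format outside of NiFi, here is a minimal Python sketch that guesses the 
format from the leading bytes of a file. The function name sniff_format is 
hypothetical, and this is not how IdentifyMimeType is implemented; it only 
shows why a .gz file does not look like a tar archive to an unpacker.

```python
def sniff_format(data: bytes) -> str:
    """Guess a container format from well-known magic bytes.

    Illustrative only: sniff_format is a hypothetical helper, not a NiFi API.
    """
    # gzip streams start with the two bytes 0x1f 0x8b.
    if data[:2] == b"\x1f\x8b":
        return "gzip"
    # zip archives normally start with a local file header, "PK\x03\x04".
    if data[:4] == b"PK\x03\x04":
        return "zip"
    # POSIX/GNU tar headers carry the string "ustar" at offset 257.
    if data[257:262] == b"ustar":
        return "tar"
    return "unknown"
```

A file that reports "gzip" here has to be decompressed first; only if the 
decompressed bytes then report "tar" or "zip" is there an archive to unpack.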

So for this case:

- GetHDFS (or ListHDFS and FetchHDFS)
- CompressContent (in decompress mode)

Now you have your text-oriented file ready to be dealt with.  If you want to 
deal with each line individually, you can use
- SplitText (line split count of 1)
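
For comparison, the same two steps (decompress, then split into lines) can be 
sketched in plain Python. This is only an illustration of what the flow above 
does conceptually, not NiFi code; the function name iter_gzip_lines is 
hypothetical, and UTF-8 text is an assumption about the file's contents.

```python
import gzip
from typing import BinaryIO, Iterator


def iter_gzip_lines(stream: BinaryIO) -> Iterator[str]:
    """Yield the lines of a gzip-compressed text stream one at a time.

    Hypothetical helper; assumes the decompressed content is UTF-8 text.
    """
    # gzip.open accepts an already-open binary file object as well as a path.
    # Text mode ("rt") decompresses and decodes lazily, so a large file is
    # not loaded into memory all at once.
    with gzip.open(stream, "rt", encoding="utf-8") as text:
        for line in text:
            # Drop the trailing newline, like a line split count of 1 would
            # hand over one line per item.
            yield line.rstrip("\n")
```

For example, `with open("part-00002.gz", "rb") as f:` followed by 
`for line in iter_gzip_lines(f): ...` would walk the file line by line.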

Thanks
Joe

On Tue, Aug 11, 2015 at 9:27 PM, 彭光裕 
<[email protected]<mailto:[email protected]>> wrote:
Hi,
     I have a compressed file fetched by the GetHDFS processor that I want to 
unpack with the UnpackContent processor. I have already set the UnpackContent 
processor's packaging format property to 'tar', but the error below always 
occurs.

The error log is shown below (Unable to unpack StandardFlowFileRecord):

2015-08-11 07:10:52,291 ERROR [Timer-Driven Process Thread-4] 
o.a.n.processors.standard.UnpackContent 
UnpackContent[id=b90c65e1-b97f-3b4b-9e37-6223afa1ef03] Unable to unpack 
StandardFlowFileRecord[uuid=85b7d53b-3183-4c48-9160-b2e714b5eaa8,claim=1439248247840-1,offset=0,name=part-00002.gz,size=59212170]
 due to org.apache.nifi.processor.exception.ProcessException: IOException 
thrown from UnpackContent[id=b90c65e1-b97f-3b4b-9e37-6223afa1ef03]: 
java.io.IOException: Error detected parsing the header; routing to failure: 
org.apache.nifi.processor.exception.ProcessException: IOException thrown from 
UnpackContent[id=b90c65e1-b97f-3b4b-9e37-6223afa1ef03]: java.io.IOException: 
Error detected parsing the header

    My compressed file is named part-00002.gz, and you can access the file 
here: https://dl.dropboxusercontent.com/u/24808937/part-00002.gz
     Any advice on how to solve this problem would be welcome. Thank you!

Roland


