Wahoo! Thanks Mark for saving me on this one! Anup, before this release, it would not have been pretty to pull that delta off! :-)
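For anyone reading this thread later, the idea of "keeping state about what has been pulled" can be sketched roughly as below. This is only an illustration of timestamp-based listing, not how ListHDFS is actually implemented. The names `ListingState` and `list_new_files` are hypothetical, and the directory listing is assumed to arrive as plain (path, modification time) pairs.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set, Tuple

# Hypothetical shape for a directory entry: (path, modification time in ms).
FileEntry = Tuple[str, int]


@dataclass
class ListingState:
    """What the lister remembers between runs."""
    latest_mtime: int = -1                              # newest timestamp reported so far
    paths_at_latest: Set[str] = field(default_factory=set)  # paths reported at that timestamp


def list_new_files(entries: Iterable[FileEntry], state: ListingState) -> List[str]:
    """Return paths not reported on earlier runs and update state in place.

    The source files are never moved or deleted, so the original copy stays put.
    """
    fresh = sorted(
        (
            (path, mtime)
            for path, mtime in entries
            if mtime > state.latest_mtime
            or (mtime == state.latest_mtime and path not in state.paths_at_latest)
        ),
        key=lambda e: (e[1], e[0]),  # oldest first, then by path
    )
    for path, mtime in fresh:
        if mtime > state.latest_mtime:
            # A newer timestamp supersedes the remembered set of paths.
            state.latest_mtime = mtime
            state.paths_at_latest = set()
        state.paths_at_latest.add(path)
    return [path for path, _ in fresh]


if __name__ == "__main__":
    state = ListingState()
    print(list_new_files([("/data/a.log", 100), ("/data/b.log", 200)], state))
    # Running again over a listing that still contains the old files
    # only reports the entries that were not seen before.
    print(list_new_files(
        [("/data/a.log", 100), ("/data/b.log", 200), ("/data/c.log", 300)], state))
```

Unlike a min/max age window, the result does not depend on how often the listing runs, because the cutoff is the remembered timestamp rather than the wall clock. In a cluster, such a lister would still need to run on a single node so that only one copy of the state exists.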
On Tue, May 5, 2015 at 11:39 AM, Mark Payne <[email protected]> wrote:

> Anup,
> With the 0.1.0 release that we are working on right now, there are two new
> processors: ListHDFS, FetchHDFS, that are able to keep state about what has
> been pulled from HDFS. This way you can keep the data in HDFS and still
> only pull in new data. Will this help?
> Thanks-Mark
>
> > From: [email protected]
> > To: [email protected]
> > Subject: RE: Fetch change list
> > Date: Tue, 5 May 2015 15:32:07 +0000
> >
> > Thanks Corey for that info. But the major problem I'm facing is I am
> > backing up a large set of data into HDFS (with a GetHDFS , source retained
> > as true) and then trying to fetch the delta from it. (get only the files
> > which have arrived recently by using the min Age and max Age). But I'm
> > unable to get the exact delta if I have 'keep source file' as true..
> > I played around a lot with schedule time and min & max age but didn't
> > help.
> >
> > -----Original Message-----
> > From: Corey Flowers [mailto:[email protected]]
> > Sent: Tuesday, May 05, 2015 5:35 PM
> > To: [email protected]
> > Subject: Re: Fetch change list
> >
> > Ok, the get file that is running, is basically causing a race condition
> > between all of the servers in your cluster. That is why you are seeing the
> > "NoSuchFile" error. If you change the scheduling strategy on that processor
> > to "On Primary node" Then the only system that will try to pick up data
> > from that mount point, is the server you have designated "primary node".
> > This should fix that issue.
> >
> > On Mon, May 4, 2015 at 11:30 PM, Sethuram, Anup <[email protected]>
> > wrote:
> >
> > > Yes Corey, Right now the pickup directory is from a network share
> > > mount point. The data is picked up from one location and transferred
> > > to the other. I'm using site-to-site communication.
> > >
> > > -----Original Message-----
> > > From: Corey Flowers [mailto:[email protected]]
> > > Sent: Monday, May 04, 2015 7:57 PM
> > > To: [email protected]
> > > Subject: Re: Fetch change list
> > >
> > > Good morning Anup!
> > >
> > > Is the pickup directory coming from a network share mount point?
> > >
> > > On Mon, May 4, 2015 at 10:11 AM, Sethuram, Anup <[email protected]>
> > > wrote:
> > >
> > > > Hi ,
> > > > I'm trying to fetch a set of files which have
> > > > recently changed in a "filesystem". Also I'm supposed to keep the
> > > > original copy as it is.
> > > > For obtaining the latest files that have changed, I'm using a
> > > > PutFile with "replace" strategy piped to a GetFile with a minimum
> > > > age of 5 sec, max file age of 30 sec, Keep source file as true,
> > > >
> > > > Also, running it in clustered mode. I'm seeing the below issues
> > > >
> > > > - The queue starts growing if there's an error.
> > > >
> > > > - Continuous errors with 'NoSuchFileException'
> > > >
> > > > - Penalizing StandardFlowFileErrors
> > > >
> > > > ERROR
> > > >
> > > > 0ab3b920-1f05-4f24-b861-4fded3d5d826
> > > >
> > > > 161.91.234.248:7087
> > > >
> > > > GetFile[id=0ab3b920-1f05-4f24-b861-4fded3d5d826] Failed to retrieve
> > > > files due to
> > > > org.apache.nifi.processor.exception.FlowFileAccessException: Failed
> > > > to import data from /nifi/UNZ/log201403230000.log for
> > > > StandardFlowFileRecord[uuid=f29bda59-8611-427c-b4d7-c921ee5e74b8,cla
> > > > im =,offset=0,name=6908587554457536,size=0]
> > > > due to java.nio.file.NoSuchFileException:
> > > > /nifi/UNZ/log201403230000.log
> > > >
> > > > 18:45:56 IST
> > > >
> > > > 10:54:50 IST
> > > >
> > > > ERROR
> > > >
> > > > c552b5bc-f627-3cc3-b3d0-545c519eafd9
> > > >
> > > > 161.91.234.248:6087
> > > >
> > > > PutFile[id=c552b5bc-f627-3cc3-b3d0-545c519eafd9] Penalizing
> > > > StandardFlowFileRecord[uuid=876e51f7-9a3d-4bf9-9d11-9073a5c950ad,cla
> > > > im =1430717088883-73580,offset=0,name=file1.log,size=29314779]
> > > > and transferring to failure due to
> > > > org.apache.nifi.processor.exception.ProcessException: Could not
> > > > rename
> > > > /nifi/UNZ/.file1.log:
> > > > org.apache.nifi.processor.exception.ProcessException:
> > > > Could not rename: /nifi/UNZ/.file1.log
> > > >
> > > > 10:54:56 IST
> > > >
> > > > ERROR
> > > >
> > > > 60662bb3-490a-3b47-9371-e11c12cdfa1a
> > > >
> > > > 161.91.234.248:7087
> > > >
> > > > PutFile[id=60662bb3-490a-3b47-9371-e11c12cdfa1a] Penalizing
> > > > StandardFlowFileRecord[uuid=522a2401-8269-4f0f-aff5-152d25cdcefa,cla
> > > > im =1430717094668-73059,offset=1533296,name=file2.log,size=28014262]
> > > > and transferring to failure due to
> > > > org.apache.nifi.processor.exception.ProcessException: Could not rename:
> > > > /data/softwares/RS/nifi/OUT/.file2.log:
> > > > org.apache.nifi.processor.exception.ProcessException: Could not rename:
> > > > /nifi/OUT/.file2.log
> > > >
> > > > Do I have to tweak the Run schedule or keep the same minimum file
> > > > age and maximum file age to overcome this issue?
> > > > What might be an elegant solution in NiFi?
> > > >
> > > > Thanks,
> > > > anup
> > > >
> > > > ________________________________
> > > > The information contained in this message may be confidential and
> > > > legally protected under applicable law. The message is intended
> > > > solely for the addressee(s). If you are not the intended recipient,
> > > > you are hereby notified that any use, forwarding, dissemination, or
> > > > reproduction of this message is strictly prohibited and may be
> > > > unlawful. If you are not the intended recipient, please contact the
> > > > sender by return e-mail and destroy all copies of the original message.

--
Corey Flowers
Vice President, Onyx Point, Inc
(410) 541-6699
[email protected]

-- This account not approved for unencrypted proprietary information --
