Wahoo! Thanks Mark for saving me on this one!

Anup, before this release, it would not have been pretty to pull that delta
off! :-)

On Tue, May 5, 2015 at 11:39 AM, Mark Payne <[email protected]> wrote:

> Anup,
> With the 0.1.0 release that we are working on right now, there are two new
> processors, ListHDFS and FetchHDFS, that are able to keep state about what has
> been pulled from HDFS. This way you can keep the data in HDFS and still
> only pull in new data. Will this help?
> Thanks
> -Mark
>
> > From: [email protected]
> > To: [email protected]
> > Subject: RE: Fetch change list
> > Date: Tue, 5 May 2015 15:32:07 +0000
> >
> > Thanks Corey for that info. But the major problem I'm facing is that I am
> > backing up a large set of data into HDFS (with a GetHDFS, source retained
> > as true) and then trying to fetch the delta from it (getting only the files
> > which have arrived recently by using the min age and max age). But I'm
> > unable to get the exact delta if I have 'keep source file' as true.
> > I played around a lot with the schedule time and the min and max age, but
> > it didn't help.
> >
> > -----Original Message-----
> > From: Corey Flowers [mailto:[email protected]]
> > Sent: Tuesday, May 05, 2015 5:35 PM
> > To: [email protected]
> > Subject: Re: Fetch change list
> >
> > Ok, the GetFile that is running is basically causing a race condition
> > between all of the servers in your cluster. That is why you are seeing the
> > "NoSuchFile" error. If you change the scheduling strategy on that processor
> > to "On primary node", then the only system that will try to pick up data
> > from that mount point is the server you have designated as the primary node.
> > This should fix that issue.
> >
> > On Mon, May 4, 2015 at 11:30 PM, Sethuram, Anup <[email protected]>
> > wrote:
> >
> > > Yes Corey, Right now the pickup directory is from a network share
> > > mount point. The data is picked up from one location and transferred
> > > to the other. I'm using site-to-site communication.
> > >
> > > -----Original Message-----
> > > From: Corey Flowers [mailto:[email protected]]
> > > Sent: Monday, May 04, 2015 7:57 PM
> > > To: [email protected]
> > > Subject: Re: Fetch change list
> > >
> > > Good morning Anup!
> > >
> > > Is the pickup directory coming from a network share mount point?
> > >
> > > On Mon, May 4, 2015 at 10:11 AM, Sethuram, Anup <[email protected]>
> > > wrote:
> > >
> > > > Hi,
> > > > I'm trying to fetch a set of files which have
> > > > recently changed in a "filesystem". Also, I'm supposed to keep the
> > > > original copy as it is.
> > > > For obtaining the latest files that have changed, I'm using a
> > > > PutFile with "replace" strategy piped to a GetFile with a minimum
> > > > age of 5 sec, max file age of 30 sec, and Keep source file as true.
> > > >
> > > > Also, I'm running it in clustered mode. I'm seeing the issues below:
> > > >
> > > > - The queue starts growing if there's an error.
> > > >
> > > > - Continuous errors with 'NoSuchFileException'
> > > >
> > > > - Penalizing StandardFlowFileErrors
> > > >
> > > >
> > > >
> > > >
> > > > ERROR
> > > >
> > > > 0ab3b920-1f05-4f24-b861-4fded3d5d826
> > > >
> > > > 161.91.234.248:7087
> > > >
> > > > GetFile[id=0ab3b920-1f05-4f24-b861-4fded3d5d826] Failed to retrieve
> > > > files due to
> > > > org.apache.nifi.processor.exception.FlowFileAccessException: Failed
> > > > to import data from /nifi/UNZ/log201403230000.log for
> > > > StandardFlowFileRecord[uuid=f29bda59-8611-427c-b4d7-c921ee5e74b8,claim=,offset=0,name=6908587554457536,size=0]
> > > > due to java.nio.file.NoSuchFileException:
> > > > /nifi/UNZ/log201403230000.log
> > > >
> > > > 18:45:56 IST
> > > >
> > > >
> > > >
> > > > 10:54:50 IST
> > > >
> > > > ERROR
> > > >
> > > > c552b5bc-f627-3cc3-b3d0-545c519eafd9
> > > >
> > > > 161.91.234.248:6087
> > > >
> > > > PutFile[id=c552b5bc-f627-3cc3-b3d0-545c519eafd9] Penalizing
> > > > StandardFlowFileRecord[uuid=876e51f7-9a3d-4bf9-9d11-9073a5c950ad,claim=1430717088883-73580,offset=0,name=file1.log,size=29314779]
> > > > and transferring to failure due to
> > > > org.apache.nifi.processor.exception.ProcessException: Could not
> > > > rename /nifi/UNZ/.file1.log:
> > > > org.apache.nifi.processor.exception.ProcessException:
> > > > Could not rename: /nifi/UNZ/.file1.log
> > > >
> > > > 10:54:56 IST
> > > >
> > > > ERROR
> > > >
> > > > 60662bb3-490a-3b47-9371-e11c12cdfa1a
> > > >
> > > > 161.91.234.248:7087
> > > >
> > > > PutFile[id=60662bb3-490a-3b47-9371-e11c12cdfa1a] Penalizing
> > > > StandardFlowFileRecord[uuid=522a2401-8269-4f0f-aff5-152d25cdcefa,claim=1430717094668-73059,offset=1533296,name=file2.log,size=28014262]
> > > > and transferring to failure due to
> > > > org.apache.nifi.processor.exception.ProcessException: Could not rename:
> > > > /data/softwares/RS/nifi/OUT/.file2.log:
> > > > org.apache.nifi.processor.exception.ProcessException: Could not rename:
> > > > /nifi/OUT/.file2.log
> > > >
> > > >
> > > >
> > > > Do I have to tweak the Run schedule or keep the same minimum file
> > > > age and maximum file age to overcome this issue?
> > > > What might be an elegant solution in NiFi?
> > > >
> > > >
> > > > Thanks,
> > > > anup
> > > >
> > > > ________________________________
> > > > The information contained in this message may be confidential and
> > > > legally protected under applicable law. The message is intended
> > > > solely for the addressee(s). If you are not the intended recipient,
> > > > you are hereby notified that any use, forwarding, dissemination, or
> > > > reproduction of this message is strictly prohibited and may be
> > > > unlawful. If you are not the intended recipient, please contact the
> > > > sender by return e-mail and destroy all copies of the original
> > > > message.
> > > >
> > >
> > >
> > >
> > > --
> > > Corey Flowers
> > > Vice President, Onyx Point, Inc
> > > (410) 541-6699
> > > [email protected]
> > >
> > > -- This account not approved for unencrypted proprietary information
> > > --
> > >
> > >
> >
> >
> >
> > --
> > Corey Flowers
> > Vice President, Onyx Point, Inc
> > (410) 541-6699
> > [email protected]
> >
> > -- This account not approved for unencrypted proprietary information --
> >
>
>



-- 
Corey Flowers
Vice President, Onyx Point, Inc
(410) 541-6699
[email protected]

-- This account not approved for unencrypted proprietary information --
