What is GW?

On Tue, Feb 2, 2016 at 9:16 AM Pramod Immaneni <[email protected]> wrote:
> Good idea to handle it in GW.
>
> On Tue, Feb 2, 2016 at 8:50 AM, Thomas Weise <[email protected]>
> wrote:
>
>> Exactly, this doesn't make sense. I filed an enhancement to have this in
>> GW a while ago.
>>
>> On Tue, Feb 2, 2016 at 8:48 AM, Pramod Immaneni <[email protected]>
>> wrote:
>>
>> > Yogi,
>> >
>> > kill is not an orderly shutdown, who will clean the state?
>> >
>> > On Tue, Feb 2, 2016 at 8:38 AM, Yogi Devendra <[email protected]>
>> > wrote:
>> >
>> > > I would prefer to have an additional argument during application
>> > > launch on dtcli.
>> > >
>> > > Say, --preserve-kill-state true .
>> > >
>> > > Basically, platform should be able to do the clean-up activity if the
>> > > application is invoked with certain flag.
>> > >
>> > > Test apps can set this flag to clear the data on kill. Production apps
>> > > can set this flag to keep the data on kill.
>> > >
>> > > Shutdown should always preserve the state. But, for kill /
>> > > forced-shutdown user might prefer to clear the state.
>> > >
>> > > ~ Yogi
>> > >
>> > > On 2 February 2016 at 21:53, Amol Kekre <[email protected]> wrote:
>> > >
>> > >>
>> > >> Can we include a script in our github (util?) that simply deletes
>> > >> these files upon application being killed, given an app-id. The admin
>> > >> will need to run this script. Auto-deleting will be bad as a lot of
>> > >> users, including those in production today need to restart using
>> > >> those files.
>> > >> The knowledge/desire to restart post failure is outside the app and
>> > >> hence technically the script should be explicitly user invoked
>> > >>
>> > >> Thks,
>> > >> Amol
>> > >>
>> > >>
>> > >> On Tue, Feb 2, 2016 at 6:12 AM, Pramod Immaneni <[email protected]>
>> > >> wrote:
>> > >>
>> > >>> Hi Venkat,
>> > >>>
>> > >>> There are typically a small number of outstanding checkpoint files
>> > >>> per operator; as newer checkpoints are created, old ones are
>> > >>> automatically deleted by the application when it determines that
>> > >>> state is no longer needed. When an application is stopped or killed,
>> > >>> the last checkpoints remain. There is also a benefit to that, since
>> > >>> a new application can be restarted to continue from those
>> > >>> checkpoints instead of starting all the way from the beginning, and
>> > >>> this is useful in some cases. But if you are always starting your
>> > >>> application from scratch, yes, you can delete the checkpoints of
>> > >>> older applications that are no longer running.
>> > >>>
>> > >>> Thanks
>> > >>>
>> > >>> On Mon, Feb 1, 2016 at 10:19 PM, Kottapalli, Venkatesh <
>> > >>> [email protected]> wrote:
>> > >>>
>> > >>> > Hi,
>> > >>> >
>> > >>> > Now that this has been discussed, will the checkpointed data be
>> > >>> > purged when we kill the application forcefully? In our current
>> > >>> > usage, we forcefully kill the app after it processes a certain
>> > >>> > batch of data. I see these small files are created under the
>> > >>> > (user/datatorrent) directory and are not removed.
>> > >>> >
>> > >>> > Another scenario: when some of the containers keep failing, we
>> > >>> > have observed this state where the data is continuously
>> > >>> > checkpointed into small files. When we kill the app, the data
>> > >>> > will be there.
>> > >>> >
>> > >>> > We have received concerns saying this is impacting namenode
>> > >>> > performance since these small files are stored in HDFS. So we
>> > >>> > manually remove these checkpointed data at regular intervals.
>> > >>> >
>> > >>> > -Venkatesh
>> > >>> >
>> > >>> > -----Original Message-----
>> > >>> > From: Amol Kekre [mailto:[email protected]]
>> > >>> > Sent: Monday, February 01, 2016 7:49 AM
>> > >>> > To: [email protected]; [email protected]
>> > >>> > Subject: Re: Possibility of saving checkpoints on other distributed
>> > >>> > filesystems
>> > >>> >
>> > >>> > Aniruddha,
>> > >>> > We have not heard this request from users yet. It may be because
>> > >>> > our checkpointing has a purge, i.e. the small files are not left
>> > >>> > over. Small file problem has been there in Hadoop and relates to
>> > >>> > storing small files in Hadoop for a longer time (more likely
>> > >>> > forever).
>> > >>> >
>> > >>> > Thks,
>> > >>> > Amol
>> > >>> >
>> > >>> >
>> > >>> > On Mon, Feb 1, 2016 at 6:05 AM, Aniruddha Thombare <
>> > >>> > [email protected]> wrote:
>> > >>> >
>> > >>> > > Hi Community,
>> > >>> > >
>> > >>> > > Or Let me say BigFoots, do you think this feature should be
>> > >>> > > available?
>> > >>> > >
>> > >>> > > The reason to bring this up was discussed in the start of this
>> > >>> > > thread as:
>> > >>> > >
>> > >>> > > > This is with the intention to recover the applications faster
>> > >>> > > > and do away with HDFS's small files problem as described here:
>> > >>> > > >
>> > >>> > > > http://blog.cloudera.com/blog/2009/02/the-small-files-problem/
>> > >>> > > >
>> > >>> > > > http://snowplowanalytics.com/blog/2013/05/30/dealing-with-hadoops-small-files-problem/
>> > >>> > > >
>> > >>> > > > http://inquidia.com/news-and-info/working-small-files-hadoop-part-1
>> > >>> > > >
>> > >>> > > > If we could save checkpoints in some other distributed file
>> > >>> > > > system (or even a HA NAS box) geared for small files, we could
>> > >>> > > > achieve -
>> > >>> > > >
>> > >>> > > > - Better performance of NN & HDFS for the production usage
>> > >>> > > >   (read: production data I/O & not temp files)
>> > >>> > > >
>> > >>> > > > - Faster application recovery in case of planned shutdown /
>> > >>> > > >   unplanned restarts
>> > >>> > > >
>> > >>> > > If you feel the need of this feature, please cast your opinions
>> > >>> > > and ideas so that it can be converted in a jira.
>> > >>> > >
>> > >>> > >
>> > >>> > > Thanks,
>> > >>> > >
>> > >>> > >
>> > >>> > > Aniruddha
>> > >>> > >
>> > >>> > > On Thu, Jan 21, 2016 at 11:19 PM, Gaurav Gupta
>> > >>> > > <[email protected]> wrote:
>> > >>> > >
>> > >>> > > > Aniruddha,
>> > >>> > > >
>> > >>> > > > Currently we don't have any support for that.
>> > >>> > > >
>> > >>> > > > Thanks
>> > >>> > > > Gaurav
>> > >>> > > >
>> > >>> > > > Thanks
>> > >>> > > > -Gaurav
>> > >>> > > >
>> > >>> > > > On Thu, Jan 21, 2016 at 12:24 AM, Tushar Gosavi
>> > >>> > > > <[email protected]> wrote:
>> > >>> > > >
>> > >>> > > > > Default FSStorageAgent can be used as it can work with local
>> > >>> > > > > filesystem, but as far as I know there is no support for
>> > >>> > > > > specifying the directory through xml file. By default it
>> > >>> > > > > uses the application directory on HDFS.
>> > >>> > > > >
>> > >>> > > > > Not sure if we could specify storage agent with its
>> > >>> > > > > properties through the configuration at dag level.
>> > >>> > > > >
>> > >>> > > > > - Tushar.
>> > >>> > > > >
>> > >>> > > > >
>> > >>> > > > > On Thu, Jan 21, 2016 at 12:14 PM, Aniruddha Thombare <
>> > >>> > > > > [email protected]> wrote:
>> > >>> > > > >
>> > >>> > > > > > Hi,
>> > >>> > > > > >
>> > >>> > > > > > Do we have any storage agent which I can use readily,
>> > >>> > > > > > configurable through dt-site.xml?
>> > >>> > > > > >
>> > >>> > > > > > I am looking for something which would save checkpoints in
>> > >>> > > > > > mounted file system [e.g. HA-NAS] which is basically just
>> > >>> > > > > > another directory for Apex.
>> > >>> > > > > >
>> > >>> > > > > >
>> > >>> > > > > > Thanks,
>> > >>> > > > > >
>> > >>> > > > > >
>> > >>> > > > > > Aniruddha
>> > >>> > > > > >
>> > >>> > > > > > On Wed, Jan 20, 2016 at 8:33 PM, Sandesh Hegde <
>> > >>> > > > > > [email protected]> wrote:
>> > >>> > > > > >
>> > >>> > > > > > > It is already supported; refer to the following jira for
>> > >>> > > > > > > more information:
>> > >>> > > > > > >
>> > >>> > > > > > > https://issues.apache.org/jira/browse/APEXCORE-283
>> > >>> > > > > > >
>> > >>> > > > > > >
>> > >>> > > > > > > On Tue, Jan 19, 2016 at 10:43 PM Aniruddha Thombare <
>> > >>> > > > > > > [email protected]> wrote:
>> > >>> > > > > > >
>> > >>> > > > > > > > Hi,
>> > >>> > > > > > > >
>> > >>> > > > > > > > Is it possible to save checkpoints in any other highly
>> > >>> > > > > > > > available distributed file systems (which may be
>> > >>> > > > > > > > mounted directories across the cluster) other than
>> > >>> > > > > > > > HDFS?
>> > >>> > > > > > > > If yes, is it configurable?
>> > >>> > > > > > > >
>> > >>> > > > > > > > AFAIK, there is no configurable option available to
>> > >>> > > > > > > > achieve that.
>> > >>> > > > > > > > If that's the case, can we have that feature?
>> > >>> > > > > > > >
>> > >>> > > > > > > > This is with the intention to recover the applications
>> > >>> > > > > > > > faster and do away with HDFS's small files problem as
>> > >>> > > > > > > > described here:
>> > >>> > > > > > > >
>> > >>> > > > > > > > http://blog.cloudera.com/blog/2009/02/the-small-files-problem/
>> > >>> > > > > > > >
>> > >>> > > > > > > > http://snowplowanalytics.com/blog/2013/05/30/dealing-with-hadoops-small-files-problem/
>> > >>> > > > > > > >
>> > >>> > > > > > > > http://inquidia.com/news-and-info/working-small-files-hadoop-part-1
>> > >>> > > > > > > >
>> > >>> > > > > > > > If we could save checkpoints in some other distributed
>> > >>> > > > > > > > file system (or even a HA NAS box) geared for small
>> > >>> > > > > > > > files, we could achieve -
>> > >>> > > > > > > >
>> > >>> > > > > > > > - Better performance of NN & HDFS for the production
>> > >>> > > > > > > >   usage (read: production data I/O & not temp files)
>> > >>> > > > > > > > - Faster application recovery in case of planned
>> > >>> > > > > > > >   shutdown / unplanned restarts
>> > >>> > > > > > > >
>> > >>> > > > > > > > Please, send your comments, suggestions or ideas.
>> > >>> > > > > > > >
>> > >>> > > > > > > > Thanks,
>> > >>> > > > > > > >
>> > >>> > > > > > > >
>> > >>> > > > > > > > Aniruddha
