I can help Dev.

Thanks,
Chandni
On Sat, May 7, 2016 at 1:23 PM, Amol Kekre <[email protected]> wrote:
> We do have docs on apache.org. Love to a very extensive and deep doc on
> this topic.
>
> Should we add "How to ..." sections?
>
> @dev, thks for volunteering. Anyone more volunteers?
>
> Thks,
> Amol
>
> On Sat, May 7, 2016 at 12:20 PM, Devendra Tagare <[email protected]> wrote:
> > @Thomas,@Amol I would like to contribute/collaborate on this.
> >
> > Will create a ticket for the same.
> >
> > Thanks,
> > Dev
> >
> > On Sat, May 7, 2016 at 11:04 AM, Thomas Weise <[email protected]> wrote:
> > > The documentation is here and is indexed:
> > >
> > > http://apex.apache.org/docs/malhar/
> > >
> > > I think this is a matter of enhancing it.
> > >
> > > On Sat, May 7, 2016 at 9:18 AM, Amol Kekre <[email protected]> wrote:
> > > > Thomas and I talked. Both of us agree that a white paper is due to get
> > > > going. Google index clearly beats "find . | grep ..." in this day and age.
> > > >
> > > > The white paper would walk through and have data on HDFS, FTP, NFS, S3,
> > > > maybe even example apps (could be app properties) accompanying this.
> > > >
> > > > So any volunteers?
> > > >
> > > > Thks
> > > > Amol
> > > >
> > > > On Thu, May 5, 2016 at 5:10 PM, Thomas Weise <[email protected]> wrote:
> > > > > Do we have other projects that create dummy classes for every possible
> > > > > mounted file system just so that the user knows that's possible? The
> > > > > capability that matters here from app perspective is local file system
> > > > > and every developer in the Hadoop ecosystem should understand that.
> > > > >
> > > > > If the operator doesn't have anything specific to NFS then there is no
> > > > > place for it in the library (it would be confusing, not helpful).
> > > > >
> > > > > There should be a different approach for pre-configured operators that
> > > > > doesn't involve writing Java code.
> > > > >
> > > > > Thomas
> > > > >
> > > > > On Thu, May 5, 2016 at 3:10 PM, Amol Kekre <[email protected]> wrote:
> > > > > > I am not suggesting duplicating code; extend the operators. Just add
> > > > > > something (may not even be a function) that can be viewed as specific to
> > > > > > a particular source. Say for NFS, it may be as simple as changing a
> > > > > > default. A file with NFS in its name help a great deal with adoption.
> > > > > >
> > > > > > Thks
> > > > > > Amol
> > > > > >
> > > > > > On Thu, May 5, 2016 at 11:45 AM, Chandni Singh <[email protected]> wrote:
> > > > > > > IMO this is not a good idea.
> > > > > > >
> > > > > > > We are proposing to add additional Java code which is generic (works with
> > > > > > > HDFS, NFS, local FS) but just calling it something specific - NFS. IMO
> > > > > > > this is much more confusing to users.
> > > > > > >
> > > > > > > If we want to make it easier for users to find out that the FS Module
> > > > > > > supports writing to NFS then maybe we need to improve documentation or
> > > > > > > highlight it somewhere else.
> > > > > > >
> > > > > > > Adding java classes means more maintenance overhead and here these
> > > > > > > classes are not doing anything additional.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Chandni
> > > > > > >
> > > > > > > On Thu, May 5, 2016 at 11:24 AM, Mohit Jotwani <[email protected]> wrote:
> > > > > > > > +1 on Sandeep's suggestion. This would make an end user's life lot more
> > > > > > > > easier!
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > Mohit
> > > > > > > >
> > > > > > > > On Thu, May 5, 2016 at 11:51 PM, Sandeep Deshmukh <[email protected]> wrote:
> > > > > > > > > I do agree with Amol on having clear and explicit modules. This is more
> > > > > > > > > from an end user perspective. For someone who is new to Apex, having
> > > > > > > > > separate NFS, HDFS, FTP, etc would make lot more sense than one generic FS
> > > > > > > > > module. However small change these modules may have, like just couple of
> > > > > > > > > small functions, I would like to have them separate for the end user.
> > > > > > > > >
> > > > > > > > > It is finally about the perspective and the user experience :)
> > > > > > > > >
> > > > > > > > > Regards,
> > > > > > > > > Sandeep
> > > > > > > > >
> > > > > > > > > On Thu, May 5, 2016 at 8:48 PM, Thomas Weise <[email protected]> wrote:
> > > > > > > > > > I don't think we should name something NFS* when it isn't specific to
> > > > > > > > > > NFS. It is just like any other local FS for this purpose and that's
> > > > > > > > > > already covered by the Hadoop file system abstraction.
> > > > > > > > > >
> > > > > > > > > > Why can't a single FS Input module accommodate all of this. Once you know
> > > > > > > > > > the FS URL, you can automatically optimize the configuration, if
> > > > > > > > > > appropriate.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Thomas
> > > > > > > > > >
> > > > > > > > > > On Thu, May 5, 2016 at 12:08 AM, Chaitanya Chebolu <[email protected]> wrote:
> > > > > > > > > > > Hi Chandni,
> > > > > > > > > > >
> > > > > > > > > > > Its a good point. I created the hierarchy based on user perspective and
> > > > > > > > > > > especially for non Java users. If I return FileSplitter and BlockReader
> > > > > > > > > > > from FS Input Module, then this module works for NFS. But, for users
> > > > > > > > > > > perspective it would be difficult, whether this module works for NFS or
> > > > > > > > > > > any other fileSystem.
> > > > > > > > > > >
> > > > > > > > > > > Regards,
> > > > > > > > > > > Chaitanya
> > > > > > > > > > >
> > > > > > > > > > > On Thu, May 5, 2016 at 11:05 AM, Chandni Singh <[email protected]> wrote:
> > > > > > > > > > > > I am sorry Chaitanya but I have more questions about this
> > > > > > > > > > > >
> > > > > > > > > > > > 1. why is the FS Input Module abstract when by default it can return
> > > > > > > > > > > > FileSplitter & BlockReader in com.datatorrent.lib.io.fs?
> > > > > > > > > > > > These implementations are not specific to NFS.
> > > > > > > > > > > >
> > > > > > > > > > > > 2. In the NFS module that you have suggested to create, what is specific
> > > > > > > > > > > > to NFS?
> > > > > > > > > > > >
> > > > > > > > > > > > Please note: I have created a ticket APEXMALHAR-2081 to remove
> > > > > > > > > > > > FSFileSplitter from library and move its feature to the base operator.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > > Chandni
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, May 4, 2016 at 10:29 PM, Chaitanya Chebolu <[email protected]> wrote:
> > > > > > > > > > > > > FSFileSplitter & BlockReader are available in com.datatorrent.lib.io.fs
> > > > > > > > > > > > > package.
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Thu, May 5, 2016 at 10:47 AM, Chandni Singh <[email protected]> wrote:
> > > > > > > > > > > > > > Ok. What is specific about the fileSplitter and blockReader returned by
> > > > > > > > > > > > > > this implementation?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On May 4, 2016 9:43 PM, "Chaitanya Chebolu" <[email protected]> wrote:
> > > > > > > > > > > > > > > Hi Chandni,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Properties wise nothing specific. FS Input Module is an abstract Module
> > > > > > > > > > > > > > > and NFS Module implements the abstract methods - createFileSplitter()
> > > > > > > > > > > > > > > and createBlockReader().
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > > Chaitanya
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Wed, May 4, 2016 at 9:45 PM, Chandni Singh <[email protected]> wrote:
> > > > > > > > > > > > > > > > Hi Chaitanya,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > What will be specific in NFS Input Module that is not provided by FS
> > > > > > > > > > > > > > > > Input Module?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > Chandni
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Wed, May 4, 2016 at 7:12 AM, Amol Kekre <[email protected]> wrote:
> > > > > > > > > > > > > > > > > +1
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thks
> > > > > > > > > > > > > > > > > Amol
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Tue, May 3, 2016 at 10:06 PM, Sandeep Deshmukh <[email protected]> wrote:
> > > > > > > > > > > > > > > > > > +1
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > > > > > Sandeep
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Fri, Apr 29, 2016 at 3:26 PM, Mohit Jotwani <[email protected]> wrote:
> > > > > > > > > > > > > > > > > > > +1
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > > > > > > Mohit
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > On Fri, Apr 29, 2016 at 2:09 PM, Chaitanya Chebolu <[email protected]> wrote:
> > > > > > > > > > > > > > > > > > > > Hi All,
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > I am proposing NFS Input Module. Use case is to read large files from
> > > > > > > > > > > > > > > > > > > > NFS in parallel.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Design of NFS input module:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > There is a common interface "FSInputModule" in Malhar for the input
> > > > > > > > > > > > > > > > > > > > Modules. NFS input Module extends from FSInputModule and can be
> > > > > > > > > > > > > > > > > > > > achieved by using FSFileSplitter and BlockReader operators.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Please share your thoughts on this.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > > > > > > > Chaitanya
