Changing the storage agent is one of the ways to address the symptoms of
the problem. But it's not treating the problem.

In this case - change the basePath to a location under target and all the
opinions are moot. And someone is claiming that we should not do it. Not
sure why. Or is anyone claiming that anymore?

--
Chetan

On Thu, Sep 3, 2015 at 1:43 PM, Thomas Weise <[email protected]> wrote:

> Good point regarding the coverage. These JUnit tests are supposed to test
> individual components and all the tests collectively should strive to
> achieve high coverage. There are tests in Apex to cover storage agents,
> recovery semantics etc.  Components that fall outside of the test scope are
> reduced as much as possible through mocks (even though there is room for
> improvement).
>
> The tests in Malhar are for operators and applications, not for the engine.
> In those cases where LM is used, the intention is to test the application
> functionality. It is expected that certain configurations are adjusted for
> the test and dependencies mocked.
>
> For the local mode, it should not be an issue to use a different storage
> agent when it simplifies the test execution. Specifically, in this case, we
> don't want to go and change many tests to make something work that isn't
> needed. LM is not "production", it is not using HDFS and there are a number
> of other important differences that make it possible to run within the IDE.
>
> Instead, focus should be on those things that help with app coverage. For
> example, in the past we had seen issues with serialization of operators
> that were not uncovered in LM, until we made the serialization part of the
> execution, even when it was not needed for execution.
>
> Thomas
>
>
>
>
> On Thu, Sep 3, 2015 at 11:21 AM, Chetan Narsude <[email protected]>
> wrote:
>
> > I think Ram explained in a little more detail on what I am thinking.
> >
> > Tests are supposed to provide code coverage. Having localcluster is
> already
> > a variable, it's not what runs in production. Having a different storage
> > agent is another variable and it misses out on testing the asynchronous
> > flow. The gap keeps on increasing if we continue to do that.  AsyncFSSA
> is
> > our default because it's supposed to do everything that FSStorageAgent
> does
> > and some more. So not clear as to why the test which creates stray
> folders
> > is not configuring the storage agent properly instead of completely
> > changing it out which brings some other problems in as I just explained.
> >
> > If changing the storage agent is the only way to fix the problem with
> > reasonable effort, then I would concede. I highly doubt that.
> >
> > --
> > Chetan
> >
> > On Thu, Sep 3, 2015 at 11:05 AM, Chandni Singh <[email protected]>
> > wrote:
> >
> > > The local mode was so far using FSStorageAgent which was used in
> > > production.
> > > In production using Async is needed because hdfs writes are slow but is
> > > that the case with LocalMode?
> > >
> > > In local mode if we use Async we are creating checkpoints under one
> local
> > > directory and then copying it to another local directory which will not
> > > improve any performance.
> > >
> > > In my opinion StramLocalCluster use synchronous checkpointing as
> default.
> > >
> > > Chandni
> > >
> > >
> > >
> > > On Thu, Sep 3, 2015 at 10:09 AM, Chetan Narsude <
> [email protected]>
> > > wrote:
> > >
> > >> That sounds a lot like self contradicting reason; Let's make a change
> > >> because we don't want to make change. :-)
> > >>
> > >> The code is in certain state. This certain state is consistent with
> how
> > >> things run in production. In test environment there is a problem that
> > stray
> > >> files are created. It's a small fix to relocate these files elsewhere.
> > What
> > >> I am trying to understand is that is not being done?
> > >>
> > >> --
> > >> Chetan
> > >>
> > >> On Thu, Sep 3, 2015 at 9:41 AM, Thomas Weise <[email protected]>
> > >> wrote:
> > >>
> > >>> There is no need to configure anything extra with the proposed
> change,
> > it
> > >>> just brings back LM to how it worked before.
> > >>>
> > >>> There is no point modifying n tests for extra setup with no gain.
> > >>>
> > >>> Thomas
> > >>>
> > >>> On Thu, Sep 3, 2015 at 9:14 AM, Chetan Narsude <
> [email protected]
> > >
> > >>> wrote:
> > >>>
> > >>> > Why does it matter that AsyncFSStorageAgent is being used with
> > >>> > LocalCluster? It using the localfs and hence no gain is the
> > >>> implementation
> > >>> > detail that's abstracted out by FileSystem already.
> > >>> >
> > >>> > If there is a problem of random artifacts left behind after the
> test,
> > >>> there
> > >>> > is a reason and most likely it's misconfiguration of the
> > StorageAgent.
> > >>> Why
> > >>> > wouldn't that be fixed.
> > >>> >
> > >>> > --
> > >>> > Chetan
> > >>> >
> > >>> >
> > >>> > On Thu, Sep 3, 2015 at 8:59 AM, Amol Kekre <[email protected]>
> > >>> wrote:
> > >>> >
> > >>> > > Clean up container files left over should be a distributed OS
> task.
> > >>> Clean
> > >>> > > up, back up, archive, ... all is for the OS (aka YARN). We must
> > >>> assume
> > >>> > kill
> > >>> > > -9.
> > >>> > >
> > >>> > > The only thing where the operator comes into play is
> "teardown()",
> > >>> which
> > >>> > is
> > >>> > > business logic (not Apex engine) issue. This could be db
> connection
> > >>> etc.
> > >>> > >
> > >>> > > Thks,
> > >>> > > Amol
> > >>> > >
> > >>> > > On Thu, Sep 3, 2015 at 8:52 AM, Thomas Weise <
> > [email protected]
> > >>> >
> > >>> > > wrote:
> > >>> > >
> > >>> > > > When the container gets killed, we should not assume anything
> > about
> > >>> > > > cleanup. It can be a kill -9. Any related "cleanup" falls under
> > >>> nice to
> > >>> > > > have, no guarantees.
> > >>> > > >
> > >>> > > > On Thu, Sep 3, 2015 at 8:49 AM, Chandni Singh <
> > >>> [email protected]
> > >>> > >
> > >>> > > > wrote:
> > >>> > > >
> > >>> > > > > I have a question regarding what Gaurav mentioned
> > >>> > > > > ----
> > >>> > > > > When container runs in cluster, "." specifies the containers
> > >>> local
> > >>> > path
> > >>> > > > on
> > >>> > > > > the node where container specific jars and other resources
> > >>> resides.
> > >>> > It
> > >>> > > > > creates a folder under that which is live as long as
> container
> > >>> lives.
> > >>> > > So
> > >>> > > > > there are no vagrant folders anywhere
> > >>> > > > > ---
> > >>> > > > >
> > >>> > > > > When the container gets killed, do we cleanup the folders
> > >>> created by
> > >>> > > > Async
> > >>> > > > > under the containers working dir?
> > >>> > > > >
> > >>> > > > > On Thu, Sep 3, 2015 at 8:42 AM, Thomas Weise <
> > >>> [email protected]
> > >>> > >
> > >>> > > > > wrote:
> > >>> > > > >
> > >>> > > > >> It makes sense to use the synchronous checkpointing for the
> > >>> local
> > >>> > > mode.
> > >>> > > > >> LM is meant to simplify dependencies and setup. The default
> > for
> > >>> > > > execution
> > >>> > > > >> on YARN remains async.
> > >>> > > > >>
> > >>> > > > >> Thomas
> > >>> > > > >>
> > >>> > > > >>
> > >>> > > > >> On Thu, Sep 3, 2015 at 8:34 AM, Chandni Singh <
> > >>> > > [email protected]>
> > >>> > > > >> wrote:
> > >>> > > > >>
> > >>> > > > >>> APPLICATION_PATH isn't related to local base dir of Async
> as
> > >>> far
> > >>> > as I
> > >>> > > > >>> know. StramLocalCluster sets the APP_PATH to "target/...".
> > >>> > > > >>> StramLocalCluster should use FSStorageAgent.
> > >>> > > > >>>
> > >>> > > > >>> - Chandni
> > >>> > > > >>>
> > >>> > > > >>> On Thu, Sep 3, 2015 at 8:20 AM, Gaurav Gupta <
> > >>> > [email protected]
> > >>> > > >
> > >>> > > > >>> wrote:
> > >>> > > > >>>
> > >>> > > > >>>> As Thomas mentioned as default remains to be async. You
> can
> > >>> either
> > >>> > > > >>>> change the storage agent or set the APPLICATION_PATH.
> > >>> > > > >>>>
> > >>> > > > >>>> When container runs in cluster, "." specifies the
> containers
> > >>> local
> > >>> > > > path
> > >>> > > > >>>> on the node where container specific jars and other
> > resources
> > >>> > > > resides. It
> > >>> > > > >>>> creates a folder under that which is live as long as
> > container
> > >>> > > lives.
> > >>> > > > So
> > >>> > > > >>>> there are no vagrant folders anywhere
> > >>> > > > >>>>
> > >>> > > > >>>> Thanks
> > >>> > > > >>>> -Gaurav
> > >>> > > > >>>>
> > >>> > > > >>>> On Wed, Sep 2, 2015 at 11:33 PM, Chandni Singh <
> > >>> > > > [email protected]
> > >>> > > > >>>> > wrote:
> > >>> > > > >>>>
> > >>> > > > >>>>> I think there is a problem in the default Async as well.
> It
> > >>> also
> > >>> > > uses
> > >>> > > > >>>>> the working directory as its local base path.
> > >>> > > > >>>>>
> > >>> > > > >>>>> In the Async -> copyToHdfs()  method, we delete the
> window
> > >>> files
> > >>> > > but
> > >>> > > > >>>>> the folder with the operator name never gets deleted.
> > >>> > > > >>>>> So on the cluster there  will be such vagrant folders in
> > the
> > >>> > > working
> > >>> > > > >>>>> directory?
> > >>> > > > >>>>>
> > >>> > > > >>>>> On Wed, Sep 2, 2015 at 11:17 PM, Thomas Weise <
> > >>> > > > [email protected]>
> > >>> > > > >>>>> wrote:
> > >>> > > > >>>>>
> > >>> > > > >>>>>> Chandni,
> > >>> > > > >>>>>>
> > >>> > > > >>>>>> Agreed. See whether the tests work with the synchronous
> > >>> storage
> > >>> > > > >>>>>> agent. If yes, change them. The default needs to remain
> > >>> async.
> > >>> > > > >>>>>>
> > >>> > > > >>>>>> Thomas
> > >>> > > > >>>>>>
> > >>> > > > >>>>>>
> > >>> > > > >>>>>> On Wed, Sep 2, 2015 at 11:05 PM, Chandni Singh <
> > >>> > > > >>>>>> [email protected]> wrote:
> > >>> > > > >>>>>>
> > >>> > > > >>>>>>> Hi,
> > >>> > > > >>>>>>>
> > >>> > > > >>>>>>> I would like to know what was the reason to use
> > >>> > > AsyncFSStorageAgent
> > >>> > > > >>>>>>> with StramLocalCluster?
> > >>> > > > >>>>>>> StramLocalCluster is mainly for testing in a
> > >>> non-distributed
> > >>> > mode
> > >>> > > > >>>>>>> and I am unclear how AsyncFSStorageAgent is helpful in
> > this
> > >>> > mode.
> > >>> > > > >>>>>>>
> > >>> > > > >>>>>>> Thanks,
> > >>> > > > >>>>>>> Chandni
> > >>> > > > >>>>>>>
> > >>> > > > >>>>>>> On Wed, Sep 2, 2015 at 10:45 PM, Chandni Singh <
> > >>> > > > >>>>>>> [email protected]> wrote:
> > >>> > > > >>>>>>>
> > >>> > > > >>>>>>>> This is because of recent changes to StramLocalCluster
> > >>> where
> > >>> > > > >>>>>>>> AsyncFSStorageAgent is used for checkpointing
> > >>> > > > >>>>>>>>
> > >>> > > > >>>>>>>> dag.setAttribute(OperatorContext.STORAGE_AGENT, new
> > >>> > > > AsyncFSStorageAgent(new Path(pathUri,
> > >>> > > > LogicalPlan.SUBDIR_CHECKPOINTS).toString(), null));
> > >>> > > > >>>>>>>>
> > >>> > > > >>>>>>>> The AsyncFSStorageAgent(String path, Configuration
> conf)
> > >>> uses
> > >>> > > "."
> > >>> > > > as localBasePath and therefore creates sub-directories per
> > >>> operator in
> > >>> > > the
> > >>> > > > current working directory.
> > >>> > > > >>>>>>>>
> > >>> > > > >>>>>>>> I am going to create a ticket to address this and will
> > >>> fix it.
> > >>> > > > >>>>>>>>
> > >>> > > > >>>>>>>> -Chandni
> > >>> > > > >>>>>>>>
> > >>> > > > >>>>>>>>
> > >>> > > > >>>>>>>> On Wed, Sep 2, 2015 at 7:13 PM, Chandni Singh <
> > >>> > > > >>>>>>>> [email protected]> wrote:
> > >>> > > > >>>>>>>>
> > >>> > > > >>>>>>>>> Hi,
> > >>> > > > >>>>>>>>>
> > >>> > > > >>>>>>>>> I can see empty folders getting created under
> > Malhar/lib
> > >>> > called
> > >>> > > > >>>>>>>>> '1' and '2'.
> > >>> > > > >>>>>>>>> I think this is because of using LocalMode to run a
> > test
> > >>> > > > >>>>>>>>> application.
> > >>> > > > >>>>>>>>>
> > >>> > > > >>>>>>>>>
> > >>> > > > >>>>>>>>> If anyone has checked in such cases please do check
> and
> > >>> let
> > >>> > us
> > >>> > > > >>>>>>>>> know.
> > >>> > > > >>>>>>>>>
> > >>> > > > >>>>>>>>> Thanks,
> > >>> > > > >>>>>>>>> Chandni
> > >>> > > > >>>>>>>>>
> > >>> > > > >>>>>>>>
> > >>> > > > >>>>>>>>
> > >>> > > > >>>>>>>
> > >>> > > > >>>>>>
> > >>> > > > >>>>>
> > >>> > > > >>>>
> > >>> > > > >>>
> > >>> > > > >>
> > >>> > > > >
> > >>> > > >
> > >>> > >
> > >>> >
> > >>>
> > >>
> > >>
> > >
> >
>

Reply via email to