Thanks guys,
Let me know if I can help you in any way to debug it to the end.
Sumit
On Apr 26, 2014 2:10 AM, "Shrinand Javadekar" <[email protected]>
wrote:

> Sumit,
>
> >> > After all experiments I feel there is a problem in Jclouds.
>
> There may be a problem with jclouds. I was trying to understand the
> test and the environment better so as to get to the bottom of this.
>
>
> On Fri, Apr 25, 2014 at 12:39 AM, Sumit Gaur <[email protected]> wrote:
> > Hi Ignasi
> > https://github.com/sumitkgaur/test
> >
> > 1) Example8.java is original programme and required all jclouds libs.
> > 2) Worker.java is the delayed delete programme.
> >
> > Thanks
> > sumit
> >
> >
> >
> >
> >
> > On Apr 25, 2014 3:45 PM, "Ignasi Barrera" <[email protected]> wrote:
> >
> >> Hi Sumit,
> >>
> >> Could you share the entire code of both programs in a git repo or pastie
> >> so we can better understand how your benchmark works, and reproduce it
> >> locally?
> >> On 25/04/2014 02:39, "Sumit Gaur" <[email protected]> wrote:
> >>
> >> > Hi Shri,
> >> > After all experiments I feel there is a problem in Jclouds.
> >> >
> >> > 1) I tried retries for every 409 error. After a successful retry, jclouds
> >> > started reporting that the blob no longer exists, but in reality it is
> >> > still there in SWIFT storage.
> >> > 2) I tried delaying the delete until after 100 puts and voila, there were
> >> > no 409s in 24 hours. That points to some race condition in jclouds if we
> >> > do an immediate DELETE after a PUT.
> >> > 3) I know the 409 errors are coming all the way from the SWIFT object
> >> > server, but the same does not happen even if I generate a much higher
> >> > "concurrent" load with a curl (PUT-GET-DEL) cycle. I was getting a TPS
> >> > of 150.
> >> > 4) I have run SWIFT without any extra daemons such as the auditor, to
> >> > avoid conflicts caused by them. The storage nodes run only the
> >> > object/container/account servers.
> >> > 5) To generate the concurrent curl load I send the curl commands in the
> >> > background. I ran this test for 48 hours without a single 409 error.
> >> > 6) For an idea of the sequence of the client code:
> >> >
> >> > static BlobStoreContext getSwiftClientView() {
> >> >     return ContextBuilder.newBuilder("swift-keystone")
> >> >             .credentials("test:tester", "test123")
> >> >             .endpoint("http://a.x.y.z.:5000/v2.0/")
> >> >             .buildView(BlobStoreContext.class);
> >> > }
> >> >
> >> > BlobStoreContext context = getSwiftClientView();
> >> > blobStore = context.getBlobStore();
> >> > blobStore.createContainerInLocation(null, containerName);
> >> > Blob blob = blobStore.blobBuilder(key).payload(file).build();
> >> >
> >> > blobStore.putBlob(containerName, blob);
> >> > blobStore.getBlob(containerName, key);
> >> > blobStore.removeBlob(containerName, key);
> >> >
> >> > Let me know if you still see any gaps.
> >> > Thanks
> >> > sumit
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > On Apr 23, 2014 2:36 AM, "Shrinand Javadekar" <[email protected]> wrote:
> >> >
> >> > > So there are two problems:
> >> > >
> >> > > 1) 409 when deleting objects.
> >> > > 2) Transactions taking longer after 24-48 hours.
> >> > >
> >> > > For (1), it looks like the request reached the Swift cluster but the
> >> > > Swift cluster itself wasn't able to fulfill it. This could be because
> >> > > of the "eventual consistency" semantics of blobstores. When the delete
> >> > > request reached Swift, it could have been in the middle of some
> >> > > operation on the object itself (e.g. reading the object for
> >> > > replicating it, auditing it, etc). Jclouds did its job of actually
> >> > > sending the request. So not sure what else can be done here. Maybe we
> >> > > could add retries if the blobstore returns 409. But the main problem
> >> > > lies on the Swift side. The OpenStack mailing list would be a better
> >> > > place for asking this question. There are many more Swift experts
> >> > > there.
> >> > >
> >> > > For (2), from the curl example code, it looks like you're creating
> >> > > multiple processes, each doing a put or a delete (no get). This is
> >> > > different from jclouds spawning multiple threads. It would be great if
> >> > > the experiments count the number of transactions they're doing and
> >> > > whether they both reach the same number of transactions in the given
> >> > > amount of time. If they do and yet there are fewer txns via jclouds
> >> > > compared to the shell script, we can conclude that jclouds is the
> >> > > cause of the problem.
> >> > >
> >> > > Now, answering some of the questions below.
> >> > >
> >> > > > It would be great if someone could let me know how jclouds delete
> >> > > > works. Is there any internal queue during put or delete? I saw that
> >> > > > if I put a small sleep of 300 ms between the put and del calls, it
> >> > > > works fine.
> >> > >
> >> > > I presume the blobstore object you're using in Example9.blobStore is
> >> > > of type "BlobStore" and not "AsyncBlobStore". AsyncBlobStore is
> >> > > deprecated. The BlobStore object is synchronous. There is no queue.
> >> > > When you call removeBlob, the request gets created and sent to the
> >> > > Swift cluster.
> >> > >
> >> > > > Also, I assume that jclouds calls are synchronous, and that put does
> >> > > > not return until the object is saved in Swift.
> >> > >
> >> > > For the BlobStore type, yes, it is sync.
> >> > >
> >> > > There are some jvm level settings that might also be at play here
> >> > > related to the amount of memory you're allocating to the heap. You
> >> > > could change the memory given to the jvm using the -Xms and -Xmx
> >> > > options.
> >> > >
> >> > > -Shri
> >> > >
> >> > > > On Apr 22, 2014 11:59 AM, "Sumit Gaur" <[email protected]> wrote:
> >> > > >
> >> > > >> Hi
> >> > > >> Please find my answer below
> >> > > >>
> >> > > >> On Apr 22, 2014 10:49 AM, "Jasdeep Hundal" <[email protected]> wrote:
> >> > > >> >
> >> > > >> > Hey Sumit,
> >> > > >> >
> >> > > >> > I have a couple more questions that might help clarify the
> >> > > >> > situation:
> >> > > >> >
> >> > > >> > 1. Are you running the stability test as a single long running Java
> >> > > >> > process (that just keeps cycling through the 10 uploads/gets/deletes)?
> >> > > >> >
> >> > > >>
> >> > > >> Yes. But this process has threads.
> >> > > >>
> >> > > >> > 2. Are you always running the test in the same container, or are you
> >> > > >> > creating new containers for each test iteration?
> >> > > >> >
> >> > > >> No, I am doing round-robin across 1000 containers.
> >> > > >>
> >> > > >> > 3. If the answer to #2 is that the test runs in a single container,
> >> > > >> > how many objects does that container currently have?
> >> > > >> >
> >> > > >>
> >> > > >> 0 in the ideal case. But as I am also facing 409 delete failures, there
> >> > > >> are some objects in each container, in the hundreds only.
> >> > > >>
> >> > > >> > It may also help to time each of the individual blobstore actions as
> >> > > >> > you run the test to see if any particular one is slowing down.
> >> > > >> >
> >> > > >>
> >> > > >> Even individual put and del times increase over time.
> >> > > >>
> >> > > >> > Jasdeep
> >> > > >> >
> >> > > >> >
> >> > > >> > On Mon, Apr 21, 2014 at 6:21 PM, Sumit Gaur <[email protected]> wrote:
> >> > > >> >
> >> > > >> > > hi Shri,
> >> > > >> > > Please find answers below
> >> > > >> > >
> >> > > >> > > On Tue, Apr 22, 2014 at 9:23 AM, Shrinand Javadekar <[email protected]> wrote:
> >> > > >> > > Few more questions to try and understand this better:
> >> > > >> > >
> >> > > >> > > 1) On the Swift instance you are using, how many replicas do you
> >> > > >> > > have?
> >> > > >> > >
> >> > > >> > > 3 replicas.
> >> > > >> > >
> >> > > >> > > 2) Also, how are you using the curl command in the shell script?
> >> > > >> > >
> >> > > >> > > I send the commands below in the background for 10 iterations and
> >> > > >> > > then wait, similar to the 10 threads in jclouds.
> >> > > >> > >
> >> > > >> > >             curl -X PUT -i -T 100k -H "X-Auth-Token: $OS_AUTH_TOKEN" http://$PROXY_LOCAL_NET_IP:80/v1/AUTH_${KEYSTONE_ID}/zest1-${cn}/zest1-${k}-${i}-${j}.txt
> >> > > >> > >             curl -X DELETE -i -H "X-Auth-Token: $OS_AUTH_TOKEN" http://$PROXY_LOCAL_NET_IP:80/v1/AUTH_${KEYSTONE_ID}/zest1-${cn}/zest1-${k}-${i}-${j}.txt
> >> > > >> > >
> >> > > >> > > I think the shell script and jclouds-with-10-parallel-threads may
> >> > > >> > > not be doing the same amount of work. In 20 hours jclouds might be
> >> > > >> > > doing much more work than the shell script. If you let the shell
> >> > > >> > > script also go up to that point, it might see failures too. Do you
> >> > > >> > > know how many PUT-GET-DEL operations have been performed when you
> >> > > >> > > start seeing the 409 errors?
> >> > > >> > >
> >> > > >> > > Actually, the 409 errors have been coming since the start of the
> >> > > >> > > test, but TPS starts degrading after 24-48 hours.
> >> > > >> > > On Apr 22, 2014 9:23 AM, "Shrinand Javadekar" <[email protected]> wrote:
> >> > > >> > >
> >> > > >> > > > Few more questions to try and understand this better:
> >> > > >> > > >
> >> > > >> > > > 1) On the Swift instance you are using, how many replicas do you
> >> > > >> > > > have?
> >> > > >> > > > 2) Also, how are you using the curl command in the shell script? I
> >> > > >> > > > think the shell script and jclouds-with-10-parallel-threads may
> >> > > >> > > > not be doing the same amount of work. In 20 hours jclouds might be
> >> > > >> > > > doing much more work than the shell script. If you let the shell
> >> > > >> > > > script also go up to that point, it might see failures too. Do you
> >> > > >> > > > know how many PUT-GET-DEL operations have been performed when you
> >> > > >> > > > start seeing the 409 errors?
> >> > > >> > > >
> >> > > >> > > > -Shri
> >> > > >> > > >
> >> > > >> > > >
> >> > > >> > > > On Mon, Apr 21, 2014 at 4:55 PM, Sumit Gaur <[email protected]> wrote:
> >> > > >> > > > > FYI, this is the block of code. Also, I am using jclouds 1.7.1
> >> > > >> > > > > (stable branch).
> >> > > >> > > > > // key is declared before the try so the catch block can log it
> >> > > >> > > > > String key = "objkey" + UUID.randomUUID();
> >> > > >> > > > > try {
> >> > > >> > > > >     Blob blob = Example9.blobStore.blobBuilder(key).payload(Example9.file).build();
> >> > > >> > > > >     Example9.blobStore.putBlob(Example9.containerName + count, blob);
> >> > > >> > > > >     Example9.blobStore.getBlob(Example9.containerName + count, key);
> >> > > >> > > > >     Example9.blobStore.removeBlob(Example9.containerName + count, key);
> >> > > >> > > > > } catch (Exception ace) {
> >> > > >> > > > >     System.out.println("Request failed for objkey " + key + "  " + ace);
> >> > > >> > > > > }
> >> > > >> > > > >
> >> > > >> > > > >
> >> > > >> > > > >
> >> > > >> > > > > On Tue, Apr 22, 2014 at 8:32 AM, Sumit Gaur <[email protected]> wrote:
> >> > > >> > > > >
> >> > > >> > > > >> Hi Shri,
> >> > > >> > > > >> Thanks for paying attention to this. Please find my answers below:
> >> > > >> > > > >>
> >> > > >> > > > >>
> >> > > >> > > > >> On Tue, Apr 22, 2014 at 2:31 AM, Shrinand Javadekar <[email protected]> wrote:
> >> > > >> > > > >>
> >> > > >> > > > >>> Sumit,
> >> > > >> > > > >>>
> >> > > >> > > > >>> I realize that you had sent out a similar email some time ago about
> >> > > >> > > > >>> performance degradation. I'm not sure if anyone has run these types
> >> > > >> > > > >>> of long running experiments with jclouds. So this may be a first.
> >> > > >> > > > >>>
> >> > > >> > > > >> I tried to debug it over the last 2 weeks without success. I want to
> >> > > >> > > > >> understand better how the jclouds code handles this use case; any
> >> > > >> > > > >> pointers on whether this is a problematic use case would help.
> >> > > >> > > > >>
> >> > > >> > > > >>>
> >> > > >> > > > >>> The 409 status is returned because of a conflict [1]. Are you sure
> >> > > >> > > > >>> you didn't have two or more threads trying to delete the same object?
> >> > > >> > > > >>>
> >> > > >> > > > >> No two threads share the same object key in my programme (String key =
> >> > > >> > > > >> "objkey" + UUID.randomUUID();). It is some kind of race between the
> >> > > >> > > > >> PUT and DEL calls. If I put, say, a 10 ms sleep between the calls,
> >> > > >> > > > >> then there is no 409 error.
> >> > > >> > > > >>
> >> > > >> > > > >>
> >> > > >> > > > >>> Also, I see that 409 is returned by Swift if you try to delete a
> >> > > >> > > > >>> container that isn't empty [2]. Is that something your test code
> >> > > >> > > > >>> could've tried?
> >> > > >> > > > >>>
> >> > > >> > > > >> I am trying to delete objects, not containers.
> >> > > >> > > > >>
> >> > > >> > > > >>>
> >> > > >> > > > >>> When you say there was a similar test you're trying with curl, are
> >> > > >> > > > >>> you using the curl command-line utility or the libcurl library?
> >> > > >> > > > >>
> >> > > >> > > > >> The curl command in a shell script with for loops.
> >> > > >> > > > >>
> >> > > >> > > > >>
> >> > > >> > > > >>> How are
> >> > > >> > > > >>> you specifying the number of threads to use and what object each
> >> > > >> > > > >>> thread should get/put/delete?
> >> > > >> > > > >>>
> >> > > >> > > > >>
> >> > > >> > > > >> It is a Java test programme using ThreadPoolExecutor, something
> >> > > >> > > > >> similar to the example here:
> >> > > >> > > > >>
> >> > > >> > > > >> http://www.javacodegeeks.com/2013/01/java-thread-pool-example-using-executors-and-threadpoolexecutor.html
> >> > > >> > > > >>
> >> > > >> > > > >> The object is a 5 KB file, with key = "objkey" + UUID.randomUUID(),
> >> > > >> > > > >> and a pool of 10 threads.
> >> > > >> > > > >>
> >> > > >> > > > >>
> >> > > >> > > > >> Hope this gives good insight. Let me know if you see any problem
> >> > > >> > > > >> here.
> >> > > >> > > > >>
> >> > > >> > > > >>
> >> > > >> > > > >>>
> >> > > >> > > > >>> Thanks.
> >> > > >> > > > >>> -Shri
> >> > > >> > > > >>>
> >> > > >> > > > >>> [1] http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
> >> > > >> > > > >>> [2] https://bugs.launchpad.net/horizon/+bug/1096084
> >> > > >> > > > >>>
> >> > > >> > > > >>> On Sun, Apr 20, 2014 at 5:55 PM, Sumit Gaur <[email protected]> wrote:
> >> > > >> > > > >>> > Hi
> >> > > >> > > > >>> > I am using the jclouds lib integrated with an OpenStack Swift +
> >> > > >> > > > >>> > Keystone combination. Things are working fine except for the
> >> > > >> > > > >>> > stability test. After 20-30 hours of testing, jclouds/SWIFT starts
> >> > > >> > > > >>> > degrading in TPS and keeps going down over time.
> >> > > >> > > > >>> >
> >> > > >> > > > >>> > 1) I am running the (PUT-GET-DEL) cycle in 10 parallel threads.
> >> > > >> > > > >>> > 2) I am also getting a lot of 409 and DEL failure responses from
> >> > > >> > > > >>> > SWIFT.
> >> > > >> > > > >>> > 3) A similar direct test with curl does not show much impact, and
> >> > > >> > > > >>> > TPS remains constant.
> >> > > >> > > > >>> >
> >> > > >> > > > >>> > Can somebody help me understand what is going wrong here?
> >> > > >> > > > >>> >
> >> > > >> > > > >>> > Thanks
> >> > > >> > > > >>> > sumit
> >> > > >> > > > >>>
> >> > > >> > > > >>
> >> > > >> > > > >>
> >> > > >> > > >
> >> > > >> > >
> >> > > >>
> >> > >
> >> >
> >>
>
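
Two suggestions in the thread above (count the transactions each test performs, and time each individual blobstore action) could be sketched roughly as below. This is only a minimal illustration, not the code from the linked repository: ObjectStore, InMemoryStore, OpStats and CycleBenchmark are hypothetical names made up for this sketch, and the in-memory store stands in for Swift so the example runs without jclouds or a cluster.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the three BlobStore calls used in the thread.
interface ObjectStore {
    void put(String container, String key, byte[] data);
    byte[] get(String container, String key);
    void remove(String container, String key);
}

// Trivial in-memory store (an assumption) so the sketch runs without Swift.
class InMemoryStore implements ObjectStore {
    private final Map<String, byte[]> objects = new ConcurrentHashMap<>();
    public void put(String c, String k, byte[] d) { objects.put(c + "/" + k, d); }
    public byte[] get(String c, String k) { return objects.get(c + "/" + k); }
    public void remove(String c, String k) { objects.remove(c + "/" + k); }
    int size() { return objects.size(); }
}

// Thread-safe call count, failure count and total latency for one operation.
class OpStats {
    final AtomicLong calls = new AtomicLong();
    final AtomicLong failures = new AtomicLong();
    final AtomicLong totalNanos = new AtomicLong();

    void time(Runnable op) {
        long start = System.nanoTime();
        try {
            op.run();
        } catch (RuntimeException e) {
            failures.incrementAndGet();   // e.g. a 409 surfaced as an exception
        } finally {
            calls.incrementAndGet();
            totalNanos.addAndGet(System.nanoTime() - start);
        }
    }

    double meanMillis() {
        long n = calls.get();
        return n == 0 ? 0.0 : totalNanos.get() / 1e6 / n;
    }
}

class CycleBenchmark {
    final OpStats put = new OpStats();
    final OpStats get = new OpStats();
    final OpStats del = new OpStats();

    // One PUT-GET-DEL cycle with a unique key, as in the test described above.
    void runCycle(ObjectStore store, String container, byte[] payload) {
        String key = "objkey" + UUID.randomUUID();
        put.time(() -> store.put(container, key, payload));
        get.time(() -> store.get(container, key));
        del.time(() -> store.remove(container, key));
    }

    // Runs the cycle on a fixed pool of worker threads and waits for completion.
    void run(ObjectStore store, int threads, int cyclesPerThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        byte[] payload = new byte[5 * 1024];   // 5 KB object
        for (int t = 0; t < threads; t++) {
            String container = "container" + t;
            pool.submit(() -> {
                for (int i = 0; i < cyclesPerThread; i++) {
                    runCycle(store, container, payload);
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.HOURS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        CycleBenchmark bench = new CycleBenchmark();
        bench.run(new InMemoryStore(), 10, 100);
        System.out.printf("PUT n=%d fail=%d mean=%.3f ms%n",
                bench.put.calls.get(), bench.put.failures.get(), bench.put.meanMillis());
        System.out.printf("GET n=%d fail=%d mean=%.3f ms%n",
                bench.get.calls.get(), bench.get.failures.get(), bench.get.meanMillis());
        System.out.printf("DEL n=%d fail=%d mean=%.3f ms%n",
                bench.del.calls.get(), bench.del.failures.get(), bench.del.meanMillis());
    }
}
```

To point this at a real cluster, one would presumably implement ObjectStore by delegating to the blobStore.putBlob, getBlob and removeBlob calls shown earlier in the thread. Printing the three counters at fixed intervals, for both the jclouds run and the curl script, would show whether both tests do the same amount of work and which operation slows down over time.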
