Re: [infinispan-dev] Dynamic Externalizers sorted for ISPN-986 - looking for some feedback

2011-04-12 Thread Dan Berindei
+1 for Externalizer/AdvancedExternalizer.

Dan


On Tue, Apr 12, 2011 at 10:17 AM, Galder Zamarreño gal...@redhat.com wrote:

 On Apr 11, 2011, at 12:06 PM, Galder Zamarreño wrote:

 Guys, any thoughts on this? I want this in for BETA2...

 On Apr 1, 2011, at 5:54 PM, Galder Zamarreño wrote:

 Hi,

 Re: https://issues.jboss.org/browse/ISPN-986

 As indicated in my comments, there's room for two types of 
 serialization mechanisms: one for end users and the other for SPIs.

 I've got a solution for this in 
 https://github.com/galderz/infinispan/commit/09096f7998c0d0a5aae76d55bf59c72fe1cb510e
  and wanted to give a heads up to everyone on what it involves:

 - Two separate externalizer interfaces: Externalizer (which currently, to 
 disrupt as little code as possible, is named EndUserExternalizer) and 
 ExternalizerSpi or ServiceProviderExternalizer (currently named 
 Externalizer). The first API has basic read/write methods and the second one 
 adds a couple more methods for more specialised behaviour. Do people 
 like these names? Or can someone come up with better names? More detailed 
 info on the use cases is in the JIRA.

 I'm currently leaning towards: Externalizer (only readObject/writeObject 
 methods) and ExternalizerSpi (would contain the current 
 https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/marshall/Externalizer.java)

 Actually, I don't like the ExternalizerSpi name, it makes configuration getters 
 awkward: getExternalizerSpis...

 Instead I'm leaning towards: Externalizer and AdvancedExternalizer, which maps 
 very nicely to Cache and AdvancedCache and makes getters more readable: 
 getAdvancedExternalizers is much clearer :)
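 In code, the split would look roughly like this (a sketch of the proposal, so
 treat the exact signatures as provisional):

    public interface Externalizer<T> extends Serializable {
       void writeObject(ObjectOutput output, T object) throws IOException;
       T readObject(ObjectInput input) throws IOException, ClassNotFoundException;
    }

    public interface AdvancedExternalizer<T> extends Externalizer<T> {
       Set<Class<? extends T>> getTypeClasses(); // the types this instance marshals
       Integer getId();                          // wire id, avoids writing class names
    }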



 - A related factor here would be to find a better name for the 
 XML/programmatic configuration, i.e. getServiceProviderExternalizers() and 
 serviceProviderExternalizer, or getExternalizerSpis() and externalizerSpi? 
 This is one thing; the other is that I'd want this XML and programmatic 
 configuration to be a bit hidden away cos it's specialised or for edge 
 cases. The obvious route for the average Infinispan user should be to add the 
 annotation and implement Infinispan's Externalizer interface. However, I don't 
 think there's anything special that can be done in the current architecture of 
 Infinispan without rethinking end user and SPI configuration.

 Since the XML is only relevant for the SPI version, the programmatic API 
 would go along the lines of getExternalizerSpis() and externalizerSpi - 
 Naming methods like this gives a direct link to the interface name while not 
 being too verbose


 - To hide JBoss Marshaller details away and to simplify some of the API it 
 provides, I've created a new @MarshallableBy annotation that maps directly 
 to what JBMAR's @Externalizer does. To get an idea of the differences for 
 the end users, see 
 https://github.com/galderz/infinispan/commit/09096f7998c0d0a5aae76d55bf59c72fe1cb510e#diff-10
  as opposed to 
 https://github.com/galderz/infinispan/commit/09096f7998c0d0a5aae76d55bf59c72fe1cb510e#diff-9.
  Are people happy with the annotation name? The cool thing is that if 
 someone wants to really use JBoss Marshaller Externalizers, they can, but I 
 think the majority will be happy with just a read/write object method.
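 To make that concrete, end-user code would look something like this (a
 hypothetical example; the annotation attribute shape is an assumption, the
 real diffs are in the links above):

    @MarshallableBy(Person.PersonExternalizer.class)
    public class Person {
       final String name;

       public Person(String name) { this.name = name; }

       public static class PersonExternalizer implements Externalizer<Person> {
          @Override
          public void writeObject(ObjectOutput output, Person person) throws IOException {
             output.writeUTF(person.name);
          }

          @Override
          public Person readObject(ObjectInput input) throws IOException {
             return new Person(input.readUTF());
          }
       }
    }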

 And that's about it! Afterwards it just needs proper documentation in the wiki 
 and javadocs, but right now I'm mostly focused on getting the naming right. 
 Thoughts?

 Ideally I'd like to get this into BETA1 (release date next Tuesday, 5th 
 April), but I'll prob hold till BETA2 to get the naming right.

 Cheers,
 --
 Galder Zamarreño
 Sr. Software Engineer
 Infinispan, JBoss Cache




___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] [Pull Request] Modular Classloading Compatibility

2011-05-04 Thread Dan Berindei
On Wed, May 4, 2011 at 10:14 AM, Galder Zamarreño gal...@redhat.com wrote:

 On May 3, 2011, at 2:33 PM, Sanne Grinovero wrote:

 2011/5/3 이희승 (Trustin Lee) trus...@gmail.com:
 On 05/03/2011 05:08 AM, Sanne Grinovero wrote:
 2011/5/2 Manik Surtani ma...@jboss.org:

 On 1 May 2011, at 13:38, Pete Muir wrote:

 As in, user API?  That's a little intrusive... e.g., put(K, V, cl) ?

 Not for put, since you have the class, just get, and I was thinking
 something more like:

 Foo foo = getUsing(key, Foo.class)

 This would be a pretty useful addition to the API anyway to avoid user 
 casts.

 Maybe as an advanced API, so as not to pollute the basic API?  A bit 
 like:

 Foo f = cache.getAdvancedCache().asClass(Foo.class).get(key);

 doesn't look much better than a cast, but is more cryptic :)

 getting back to the classloader issue, what about:

 Cache c = cacheManager.getCache( cacheName, classLoader );

 or
 Cache c = cacheManager.getCache( cacheName ).usingClassLoader(classLoader 
 );

 BTW if that's an issue on the API, maybe you should propose it to
 JSR-107 as well ?

 We have a configurable Marshaller, right?  Then why don't we just use
 the class loader that the current Marshaller uses?

 +1
 I like the clean approach, not sure how you configure the current
 Marshaller to use the correct CL ?
 Likely hard to do via configuration file :)

 Well, the marshaller is a global component and so it lives at the cache manager 
 level. You can't make any assumptions about its classloader, particularly when 
 lazy deserialization is configured and you want to make sure that the data in 
 the cache is deserialized with the correct classloader when the user reads it. 
 This is gonna become even more important when we, for example, move to having 
 a single cache for all 2LC entities or all EJB3 SFSBs, where we'll definitely 
 need multiple classloaders to access a single cache.


The current unmarshaller uses the TCCL, which is a great idea for
non-modular environments and will still work in AS7 for application
classes (so it's still a good default). It probably won't work if
Hibernate wants to store its own classes in the cache, because
Hibernate's internal classes may not be reachable from the
application's classloader.

It gets even trickier if Hibernate wants to store a
PrivateHibernateCollection class in the cache containing instances of
application classes inside. Then I don't think there will be any
single classloader that could reach both the Hibernate classes and the
application classes so it can properly unmarshal both. Perhaps that's
just something for the Hibernate folks to worry about? Or maybe we
should allow the users to register more than one classloader with a
cache?

Dan

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev

Re: [infinispan-dev] [Pull Request] Modular Classloading Compatibility

2011-05-04 Thread Dan Berindei
On Wed, May 4, 2011 at 7:18 AM, 이희승 (Trustin Lee) trus...@gmail.com wrote:
 On 05/03/2011 09:33 PM, Sanne Grinovero wrote:
 2011/5/3 이희승 (Trustin Lee) trus...@gmail.com:
 On 05/03/2011 05:08 AM, Sanne Grinovero wrote:
 2011/5/2 Manik Surtani ma...@jboss.org:

 On 1 May 2011, at 13:38, Pete Muir wrote:

 As in, user API?  That's a little intrusive... e.g., put(K, V, cl) ?

 Not for put, since you have the class, just get, and I was thinking
 something more like:

 Foo foo = getUsing(key, Foo.class)

 This would be a pretty useful addition to the API anyway to avoid user 
 casts.

 Maybe as an advanced API, so as not to pollute the basic API?  A bit 
 like:

 Foo f = cache.getAdvancedCache().asClass(Foo.class).get(key);

 doesn't look much better than a cast, but is more cryptic :)

 getting back to the classloader issue, what about:

 Cache c = cacheManager.getCache( cacheName, classLoader );

 or
 Cache c = cacheManager.getCache( cacheName ).usingClassLoader(classLoader 
 );

 BTW if that's an issue on the API, maybe you should propose it to
 JSR-107 as well ?

 We have a configurable Marshaller, right?  Then why don't we just use
 the class loader that the current Marshaller uses?

 +1
 I like the clean approach, not sure how you configure the current
 Marshaller to use the correct CL ?
 Likely hard to do via configuration file :)

 By default, we could use the class loader that the current Unmarshaller
 uses, and let the user choose a class loader for a certain get() call.

 So, we need to deal with two issues here:

 1) Make sure that the user can configure the default class loader at runtime,
 and that calling get() (and consequently the related unmarshaller) will use
 the default class loader:

   cache.setDefaultClassLoader(UserClass.class.getClassLoader())

 2) Provide an additional API that allows a user to specify a class
 loader for the current transaction or context:

   cache.get(K key, Class<V> clazz) // +1 to Pete's suggestion


If V is Set<MyObject> then Set<MyObject>.class.getClassLoader() will
give you the system classloader, which in most cases won't be able to
load the MyObject class. It may not even be a Set<MyObject> of course,
it could be just Set<?>.

We could have a shortcut so the users can pass in a class instead of
a classloader to avoid writing .getClassLoader(), but we shouldn't
require any relation between the class passed in and the class of the
return value.
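Concretely, the shortcut would just be a pair of overloads (names are
illustrative, nothing agreed on yet):

   // take the classloader directly
   <V> V get(Object key, ClassLoader classLoader);

   // convenience: borrow the classloader of an arbitrary class; the class
   // passed in deliberately says nothing about the type of the result
   <V> V get(Object key, Class<?> classLoaderSource);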

Dan

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev

Re: [infinispan-dev] [Pull Request] Modular Classloading Compatibility

2011-05-04 Thread Dan Berindei
On Wed, May 4, 2011 at 5:09 PM, Pete Muir pm...@redhat.com wrote:

 On 4 May 2011, at 05:34, Dan Berindei wrote:

 On Wed, May 4, 2011 at 10:14 AM, Galder Zamarreño gal...@redhat.com wrote:

 On May 3, 2011, at 2:33 PM, Sanne Grinovero wrote:

 2011/5/3 이희승 (Trustin Lee) trus...@gmail.com:
 On 05/03/2011 05:08 AM, Sanne Grinovero wrote:
 2011/5/2 Manik Surtani ma...@jboss.org:

 On 1 May 2011, at 13:38, Pete Muir wrote:

 As in, user API?  That's a little intrusive... e.g., put(K, V, cl) ?

 Not for put, since you have the class, just get, and I was thinking
 something more like:

 Foo foo = getUsing(key, Foo.class)

 This would be a pretty useful addition to the API anyway to avoid user 
 casts.

 Maybe as an advanced API, so as not to pollute the basic API?  A bit 
 like:

 Foo f = cache.getAdvancedCache().asClass(Foo.class).get(key);

 doesn't look much better than a cast, but is more cryptic :)

 getting back to the classloader issue, what about:

 Cache c = cacheManager.getCache( cacheName, classLoader );

 or
 Cache c = cacheManager.getCache( cacheName 
 ).usingClassLoader(classLoader );

 BTW if that's an issue on the API, maybe you should propose it to
 JSR-107 as well ?

 We have a configurable Marshaller, right?  Then why don't we just use
 the class loader that the current Marshaller uses?

 +1
 I like the clean approach, not sure how you configure the current
 Marshaller to use the correct CL ?
 Likely hard to do via configuration file :)

 Well, the marshaller is a global component and so it lives at the cache 
 manager level. You can't make any assumptions about its classloader, 
 particularly when lazy deserialization is configured and you want to make 
 sure that the data in the cache is deserialized with the correct classloader 
 when the user reads it. This is gonna become even more important when we, 
 for example, move to having a single cache for all 2LC entities or all EJB3 
 SFSBs, where we'll definitely need multiple classloaders to access a single 
 cache.


 The current unmarshaller uses the TCCL, which is a great idea for
 non-modular environments and will still work in AS7 for application
 classes (so it's still a good default). It probably won't work if
 Hibernate wants to store its own classes in the cache, because
 Hibernate's internal classes may not be reachable from the
 application's classloader.

 It gets even trickier if Hibernate wants to store a
 PrivateHibernateCollection class in the cache containing instances of
 application classes inside. Then I don't think there will be any
 single classloader that could reach both the Hibernate classes and the
 application classes so it can properly unmarshal both. Perhaps that's
 just something for the Hibernate folks to worry about? Or maybe we
 should allow the users to register more than one classloader with a
 cache?

 You need to use a bridge classloader in this case.

You're right of course, Hibernate must have received/guessed the
application's classloader and so it is able to create a bridge
classloader that includes both.

And if the application classes include references to classes from
other modules, the application has to provide a bridge classloader
to Hibernate anyway.
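For the record, the bridge itself can be tiny - a minimal sketch, not what
Hibernate actually ships:

   // Tries each delegate classloader in turn; the first match wins.
   public class BridgeClassLoader extends ClassLoader {
      private final ClassLoader[] delegates;

      public BridgeClassLoader(ClassLoader... delegates) {
         super(null); // no parent: resolution is handled entirely by the delegates
         this.delegates = delegates;
      }

      @Override
      protected Class<?> findClass(String name) throws ClassNotFoundException {
         for (ClassLoader delegate : delegates) {
            try {
               return delegate.loadClass(name);
            } catch (ClassNotFoundException ignored) {
               // fall through to the next delegate
            }
         }
         throw new ClassNotFoundException(name);
      }
   }

new BridgeClassLoader(hibernateLoader, applicationLoader) can then reach both
sets of classes, at the cost of ambiguity when the two loaders disagree.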

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev

Re: [infinispan-dev] [Pull Request] Modular Classloading Compatibility

2011-05-04 Thread Dan Berindei
2011/5/4 Pete Muir pm...@redhat.com:

 On 4 May 2011, at 09:55, Dan Berindei wrote:

 On Wed, May 4, 2011 at 7:18 AM, 이희승 (Trustin Lee) trus...@gmail.com 
 wrote:
 On 05/03/2011 09:33 PM, Sanne Grinovero wrote:
 2011/5/3 이희승 (Trustin Lee) trus...@gmail.com:
 On 05/03/2011 05:08 AM, Sanne Grinovero wrote:
 2011/5/2 Manik Surtani ma...@jboss.org:

 On 1 May 2011, at 13:38, Pete Muir wrote:

 As in, user API?  That's a little intrusive... e.g., put(K, V, cl) ?

 Not for put, since you have the class, just get, and I was thinking
 something more like:

 Foo foo = getUsing(key, Foo.class)

 This would be a pretty useful addition to the API anyway to avoid user 
 casts.

 Maybe as an advanced API, so as not to pollute the basic API?  A bit 
 like:

 Foo f = cache.getAdvancedCache().asClass(Foo.class).get(key);

 doesn't look much better than a cast, but is more cryptic :)

 getting back to the classloader issue, what about:

 Cache c = cacheManager.getCache( cacheName, classLoader );

 or
 Cache c = cacheManager.getCache( cacheName 
 ).usingClassLoader(classLoader );

 BTW if that's an issue on the API, maybe you should propose it to
 JSR-107 as well ?

 We have a configurable Marshaller, right?  Then why don't we just use
 the class loader that the current Marshaller uses?

 +1
 I like the clean approach, not sure how you configure the current
 Marshaller to use the correct CL ?
 Likely hard to do via configuration file :)

 By default, we could use the class loader that the current Unmarshaller
 uses, and let the user choose a class loader for a certain get() call.

 So, we need to deal with two issues here:

 1) Make sure that the user can configure the default class loader at runtime,
 and that calling get() (and consequently the related unmarshaller) will use
 the default class loader:

  cache.setDefaultClassLoader(UserClass.class.getClassLoader())

 2) Provide an additional API that allows a user to specify a class
 loader for the current transaction or context:

  cache.get(K key, Class<V> clazz) // +1 to Pete's suggestion


 If V is Set<MyObject> then Set<MyObject>.class.getClassLoader() will
 give you the system classloader, which in most cases won't be able to
 load the MyObject class. It may not even be a Set<MyObject> of course,
 it could be just Set<?>.

 We could have a shortcut so the users can pass in a class instead of
 a classloader to avoid writing .getClassLoader(), but we shouldn't
 require any relation between the class passed in and the class of the
 return value.

 There are two forces at play here. Firstly the desire to introduce greater 
 type safety into the Cache API, by returning a typed value rather than 
 just an Object, and secondly the need to specify the CL.

 We could have an API like:

 <V, CL> V get(Object key, Class<CL> classLoaderType);

 and an overloaded form

 <V> V get(Object key, ClassLoader cl);

 This does involve Infinispan having to do a runtime cast to V without the 
 class type passed in as a parameter, but this may be a necessary evil in the 
 interest of improving the user API.


Cache<K,V> extends Map<K,V>, so we already have runtime casts to V for

V get(Object key)

It seems natural to extend it to also return objects of the cache's V
type parameter instead of returning an arbitrary type:

<CL> V get(Object key, Class<CL> classLoaderType);
V get(Object key, ClassLoader cl);

Besides, allowing get() to return an arbitrary type will only make the
user code look more type safe, without changing its behaviour at all.
If the user wants type safety he can already define different caches
with different V type parameters and in that case we will actually
enforce that the put argument is assignable to V at compile time.

 An alternative which is somewhat more verbose, somewhat safer, and somewhat 
 more complex to explain (but ultimately more logical):

 // Use the classloader of the expectedType, this would be the normal case
 <V> V get(Object key, Class<V> expectedType);

 // Specify the classloader to use as a class
 <V, CL> V get(Object key, Class<V> expectedType, Class<CL> classLoader);

 // Specify the classloader to use as a classloader
 <V> V get(Object key, Class<V> expectedType, ClassLoader classLoader);

 We could also include

 // Use the TCCL
 V get(Object key);

 Any thoughts?


I'd allow the user to set another default classloader on the cache
instead of the TCCL, like in Sanne's proposal:

Cache c = cacheManager.getCache( cacheName, defaultClassLoader );
or
Cache c = cacheManager.getCache( cacheName ).usingDefaultClassLoader(
classLoader );
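With generics, usage would look something like this (usingDefaultClassLoader
is just a name for the idea, not an existing method):

   // default classloader fixed when the cache reference is obtained
   Cache<String, Object> c1 = cacheManager.getCache("users", bundleClassLoader);

   // or fixed later, on an existing reference
   Cache<String, Object> c2 = cacheManager.<String, Object>getCache("users")
         .usingDefaultClassLoader(bundleClassLoader);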

Dan

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev

Re: [infinispan-dev] [Pull Request] Modular Classloading Compatibility

2011-05-07 Thread Dan Berindei
On Thu, May 5, 2011 at 10:05 PM, Emmanuel Bernard
emman...@hibernate.org wrote:
 Quick question.
 In case of 2.a (ie setting the TCCL), this is a requirement for frameworks 
 and libraries using Infinispan, right? Not / never for a user application 
 using Infinispan (in this case it seems that the application class loader 
 will do the right thing and is set as the TCCL by the AS environment)?


My understanding is that we only need to worry about the TCCL if a
library is creating threads and is sharing those threads between
applications - when you create a new thread it inherits the TCCL from
its parent thread. I'm not sure we should expect the library to
set the TCCL around user callbacks in this case though, because the
TCCL is only necessary and guaranteed to work in a JEE environment: in a
standalone application you basically have one classloader, and in an OSGi
application the TCCL is not guaranteed to be set properly anyway.

It's also possible that an application will want to store in the cache
an object whose class is not accessible through the application's
classloader.
E.g. if another module has an interface A and a factory for A objects
in an exported package that's accessible from the application's
classloader but the actual implementation class is not exported and so
it's not accessible.

For option 2.b we already have the InvocationContext, we only need to
add the classloader to it. Even if we went the 2.a route, for
async calls I think we can unmarshal the object on a different thread
(which I also think could be shared between apps), so we may need to
save the TCCL at the time of the get() call in the InvocationContext
and use that for unmarshalling.
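Unmarshalling with a captured classloader instead of the worker thread's TCCL
is the easy part; a minimal sketch with plain JDK serialization (java.io
types; our marshaller would need the equivalent hook):

   static Object readWithLoader(byte[] bytes, final ClassLoader loader)
         throws IOException, ClassNotFoundException {
      // resolve every class against the captured loader, ignoring the TCCL
      ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes)) {
         @Override
         protected Class<?> resolveClass(ObjectStreamClass desc)
               throws ClassNotFoundException {
            return Class.forName(desc.getName(), false, loader);
         }
      };
      try {
         return in.readObject();
      } finally {
         in.close();
      }
   }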

So I'd choose 2.b with TCCL as the default classloader. The only
catch, as Galder said, is we have to be very careful not to forget
invocation contexts lying around or we will leak classloaders.

Dan



 On 5 mai 2011, at 10:55, Pete Muir wrote:

 I talked about this with Emmanuel last night, and we were

 a) concerned about the pollution of the API that this implies
 b) not sure why we need to do this

 So I also spoke to Jason to understand his initial motivation, and from this 
 chat I think it's clear that things have gone off course here.

 Jason raised two issues with modularity/classloading in Infinispan:

 1) Using the TCCL as the classloader to load Infinispan classes is a bad 
 idea. Instead we should use 
 org.infinispan.foo.Bar.getClass().getClassLoader().

 2) When we need to load application classes we need a different approach to 
 that used today. Most of the time the TCCL is perfect for this. However 
 *sometimes* Infinispan may be invoked outside of the application, when the 
 TCCL may not be set in AS7. Jason and I discussed three options:

 a) require (through a platform integration documentation contract) that the 
 TCCL must always be set when Infinispan is invoked.
 b) Have some sort of InvocationContext which knows what the correct 
 classloader to use is (aka Hibernate/Seam/Weld design where there is a 
 per-application construct based on a ThreadLocal). Given this hasn't been 
 designed into the core, it seems like a large retrofit
 c) Make users specify the CL (directly or indirectly) via the API (as we 
 discussed).

 Personally I think that (a) is the best approach right now, and is not that 
 onerous a requirement.

 We might want to make the TCCL usage pluggable for OSGi envs. Cc'd David to 
 get his feedback.

 On 4 May 2011, at 10:46, Dan Berindei wrote:

 On Wed, May 4, 2011 at 5:09 PM, Pete Muir pm...@redhat.com wrote:

 On 4 May 2011, at 05:34, Dan Berindei wrote:

 On Wed, May 4, 2011 at 10:14 AM, Galder Zamarreño gal...@redhat.com 
 wrote:

 On May 3, 2011, at 2:33 PM, Sanne Grinovero wrote:

 2011/5/3 이희승 (Trustin Lee) trus...@gmail.com:
 On 05/03/2011 05:08 AM, Sanne Grinovero wrote:
 2011/5/2 Manik Surtani ma...@jboss.org:

 On 1 May 2011, at 13:38, Pete Muir wrote:

 As in, user API?  That's a little intrusive... e.g., put(K, V, 
 cl) ?

 Not for put, since you have the class, just get, and I was thinking
 something more like:

 Foo foo = getUsing(key, Foo.class)

 This would be a pretty useful addition to the API anyway to avoid 
 user casts.

 Maybe as an advanced API, so as not to pollute the basic API?  A 
 bit like:

 Foo f = cache.getAdvancedCache().asClass(Foo.class).get(key);

  doesn't look much better than a cast, but is more cryptic :)

 getting back to the classloader issue, what about:

 Cache c = cacheManager.getCache( cacheName, classLoader );

 or
 Cache c = cacheManager.getCache( cacheName 
 ).usingClassLoader(classLoader );

 BTW if that's an issue on the API, maybe you should propose it to
 JSR-107 as well ?

 We have a configurable Marshaller, right?  Then why don't we just use
 the class loader that the current Marshaller uses?

 +1
 I like the clean approach, not sure how you configure the current
 Marshaller to use the correct CL ?
 Likely hard to do via configuration file

Re: [infinispan-dev] JGroupsDistSync and ISPN-83

2011-05-11 Thread Dan Berindei
On Wed, May 11, 2011 at 11:25 AM, Vladimir Blagojevic
vblag...@redhat.com wrote:
 The more I research ReentrantReadWriteLock the more shocked I am. Not
 only does a thread wanting to acquire the write lock first have to release
 the read lock, but we can also block forever even if we release the read
 lock, if we have acquired that read lock reentrantly. Each call to unlock
 just reduces the hold count, and the lock is only actually released when
 the hold count hits zero. Surreal!


If ReentrantReadWriteLock allowed upgrades then you would get a
deadlock when two threads both hold the read lock and try to upgrade
to a write lock at the same time.
There's always a trade-off...
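The restriction is easy to demonstrate (java.util.concurrent.locks):

   ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();
   rwl.readLock().lock();
   boolean upgraded = rwl.writeLock().tryLock(); // false: a reader (ourselves) holds on
   rwl.readLock().unlock();
   boolean acquired = rwl.writeLock().tryLock(); // true: no readers left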

I'm not familiar with the code, but are you sure the read lock is
being held by the same thread that is trying to acquire the write
lock?

Dan


 People have already debated this issue:
 http://stackoverflow.com/questions/464784/java-reentrantreadwritelocks-how-to-safely-acquire-write-lock

 In conclusion we have to seriously revisit all our uses of
 ReentrantReadWriteLock!

 Vladimir


 On 11-05-10 11:31 AM, Vladimir Blagojevic wrote:
 Hi,

 Would you please confirm my understanding and reading of the javadoc for
 ReentrantReadWriteLock under the section on lock downgrading. It says:
 Reentrancy also allows downgrading from the write lock to a read lock,
 by acquiring the write lock, then the read lock and then releasing the
 write lock. However, upgrading from a read lock to the write lock is not
 possible. The ReentrantReadWriteLock javadoc code example with the class
 CachedData also shows how a thread owning a read lock first has to
 release it in order to obtain a write lock! So a thread owning a read
 lock first has to release its read lock in order to obtain a write lock?
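 (For reference, the downgrade direction that the javadoc does allow looks
 like this:)

    // given: final ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();
    rwl.writeLock().lock();
    try {
       // ... mutate the cached state ...
       rwl.readLock().lock();    // take the read lock before giving up the write lock
    } finally {
       rwl.writeLock().unlock(); // downgraded: still holding the read lock
    }
    try {
       // ... read the state this thread just wrote, consistently ...
    } finally {
       rwl.readLock().unlock();
    }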

 This is very symptomatic in logs of ISPN-83 such as this one
 https://issues.jboss.org/secure/attachment/12341409/server1.log

 Recall how FLUSH stops all invocations on the cluster and therefore all read
 lock acquisitions in JGroupsDistSync, ultimately enabling smooth write
 lock acquisitions for state transfer and what not. In conclusion this
 leads me to wonder if the culprit of all this FLUSH mess is rooted in
 the read/write lock semantics from above?

 Regards,
 Vladimir




 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] https://issues.jboss.org/browse/ISPN-977

2011-05-11 Thread Dan Berindei
On Wed, May 11, 2011 at 5:51 PM, Mircea Markus mircea.mar...@jboss.com wrote:

 On 11 May 2011, at 13:57, Dan Berindei wrote:

 With the new push-based rebalancing code, we install the new CH before
 doing any rebalancing work, so the inconsistency window should be much
 smaller.
 In theory we also block new transactions, maybe you could use that to
 stop generating new keys while rehashing.
 that sounds interesting, I'll take a look!

 On the other hand, in 5.0 you should be able to implement
 KeyAffinityService on top of Pete's grouping work for
 https://issues.jboss.org/browse/ISPN-312 and then you won't need any
 listeners.
 Not sure how that could work? KAS generates pseudo-random keys that map to a 
 certain node. It's up to the user to place them in the cache at some point, if 
 ever.

I was thinking of wrapping the keys generated by the user's
KeyGenerator together with a node address in our own MagicKey class
and setting the @Group annotation on the address field.
We already have a MagicKey class for testing, the only problem with it
is that it only uses the address for hashing, so you'd get lots of
collisions. With Pete's changes MagicKey becomes something we can
actually expose to the users (I was thinking through the
KeyAffinityService API).

I didn't think this through however, because MagicKey<K> would have to
extend K in order to satisfy the KeyAffinityService API (and even to
be able to store the key in the Cache<K,V> later).
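The wrapper shape I have in mind, roughly (illustrative only; @Group is the
annotation from Pete's grouping work, Address is the node address, and
equals()/hashCode() over both fields are omitted for brevity):

   public class MagicKey<K> implements Serializable {
      private final K wrapped;         // the key produced by the user's KeyGenerator
      private final String ownerGroup; // derived from the target node's address

      public MagicKey(K wrapped, Address owner) {
         this.wrapped = wrapped;
         this.ownerGroup = owner.toString();
      }

      @Group
      public String getGroup() {
         // keys that share a group are hashed together, i.e. co-located
         return ownerGroup;
      }
   }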

Your question does raise an interesting point, though: a new node can
join between the user call to
KeyAffinityService.getKeyForAddress(address) and
Cache.put(generatedKey), making generatedKey map to the new node. So
there is no way for us to completely eliminate the issue of stale
keys.



 Dan


 On Wed, May 11, 2011 at 3:29 PM, Sanne Grinovero
 sanne.grinov...@gmail.com wrote:
 First thing I thought when reading your email was OMG do we support
 on-the-fly hash implementation changes? crazy!

 That's obviously not the case, but if you name it
 @ConsistentHashChangeListener that's what I would think.

 Wouldn't it be better to change the exact timing of the view change
 event? I don't see why Infinispan users would be more interested in
 knowing about the topology details according to the transport than
 about the actual Infinispan hashing topology - I would
 expect that when I receive such a notification the new view is already
 installed and ready to go; actually I thought that was the case since
 forever.

 What would be the use cases to get the notification *before* the new
 hash is installed?

 Cheers,
 Sanne

 2011/5/11 Mircea Markus mircea.mar...@jboss.com:
 Hi,
 The basic problem behind this is that I need to be notified when a new
 consistent hash is installed.

 ATM there isn't any support (of which I know) for a
 @ConsistentHashChangeListener.

 I'm thinking to add such notifications either:
 a) internally: Observer pattern on DistributionManager or even on
 DistributionManagerImpl
 b) more generically, as a fully fledged listener.

 I favor a) and then if more people ask for it we will expose it as a fully
 fledged listener.

 Suggestions?

 Cheers,
 Mircea

 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] IDEA:building non-OSGi libraries...

2011-05-12 Thread Dan Berindei
I disabled the Osmorc plugin and that seemed to stop it.

Dan


On Wed, May 11, 2011 at 8:50 PM, Mircea Markus mircea.mar...@jboss.com wrote:
 Hi,

 Whenever I'm switching branches from 4.2.x to master the build in IDEA takes 
 ages (even more, about 5 mins!!!). Most of the time is spent in Building 
 non-OSGi libraries for module...
 Any idea on what can be done to speed things up?

 Cheers,
 Mircea
 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] [Pull Request] Modular Classloading Compatibility

2011-05-12 Thread Dan Berindei
On Wed, May 11, 2011 at 11:18 PM, David Bosschaert da...@redhat.com wrote:
 On 11/05/2011 17:54, Dan Berindei wrote:
 On Wed, May 11, 2011 at 7:08 PM, Pete Muir pm...@redhat.com wrote:
 Were we developing for OSGi I would certainly agree with you. However in 
 many environments today we can reasonably expect the TCCL to be set and to 
 be able to load the classes we need. So whilst making it part of the API is 
 the safest option, it's also complicating the API for the sake of the 
 few at the cost of the many. Further this also seems kinda nasty to me. We 
 know the class (and hence bundle/module) when we put the object into 
 Infinispan, therefore why do we require people to respecify this again?

 David, can we not actually do something here akin to what we are discussing 
 for Weld? Whereby we can serialize out the bundle id and then find the 
 correct CL based on that when we deserialize.
 What if the object is a java.util.ArrayList? Each element in the list
 could belong to a different bundle, so you'd have to write a bundle id
 for every element in the list.
 Yes, if you know the Bundle-SymbolicName and Version (or the Bundle ID)
 you can find its classloader.

 On the other question, if you're passing in a class object then you can
 obtain its classloader and hence the bundle where it came from. But, and
 I think this is what Dan alluded to above, is it always true that the
 class you're passing in comes from the bundle that you need to have, or
 could it also come from one of its parent class loaders?


Exactly David, sorry if my message was a little cryptic. I think in
order to handle every case properly you would have to go through the
entire object graph being stored in the cache in order to find all the
classloaders/bundle ids that you will need on get().

That seems like a lot of overhead to me, and forcing the user to
provide the classloader doesn't seem that bad in comparison. Perhaps
we should use something other than a thread-local for this though, so
that users can hold on to the result of a
cacheManager.getCache(A).usingClassLoader(A.class) and never have to
provide the classloader again.

In fact I think this is a good idea for the invocation flags we
already have, too. It would involve creating lots of overloads in
CacheDelegate with a PreInvocationContext parameter and a new
CacheDelegateWithContext class to invoke those methods, but the public
API would remain the same.
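Roughly (CacheDelegate is today's impl class; PreInvocationContext and the
rest are names I'm making up for the sketch):

   public class CacheDelegateWithContext<K, V> /* implements Cache<K, V> */ {
      private final CacheDelegate<K, V> delegate; // the real cache
      private final PreInvocationContext ctx;     // classloader, flags, ...

      CacheDelegateWithContext(CacheDelegate<K, V> delegate, PreInvocationContext ctx) {
         this.delegate = delegate;
         this.ctx = ctx;
      }

      public V get(Object key) {
         // every operation passes the per-caller context explicitly
         return delegate.get(key, ctx);
      }

      public CacheDelegateWithContext<K, V> usingClassLoader(ClassLoader cl) {
         // cheap immutable handle; safe to keep and reuse outside loops
         return new CacheDelegateWithContext<K, V>(delegate, ctx.withClassLoader(cl));
      }
   }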


Cheers
Dan

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] [Pull Request] Modular Classloading Compatibility

2011-05-12 Thread Dan Berindei
On Thu, May 12, 2011 at 1:22 PM, Emmanuel Bernard
emman...@hibernate.org wrote:

 On 12 mai 2011, at 11:18, Dan Berindei wrote:

 That seems like a lot of overhead to me, and forcing the user to
 provide the classloader doesn't seem that bad in comparison. Perhaps
 we should use something other than a thread-local for this though, so
 that users can hold on to the result of a
 cacheManager.getCache(A).usingClassLoader(A.class) and never have to
 provide the classloader again.

 Note that I don't think it works in the case you mentioned earlier, i.e. an 
 ArrayList with classes from many different bundles.


It works, only now it's the user's job to provide the right classloader.

I suspect most people will have the same situation as Augusto, the guy
who was asking about classloading rules on the forum
(http://community.jboss.org/message/602759): they already have a
bundle that has access to all the classes that will ever be in the cache
(in Augusto's case via DynamicImport-Package: *), their only problem
is convincing Infinispan to use that bundle's classloader for
unmarshalling.

If they don't like the idea of having a catch-all bundle, they can
instead write a bridge classloader themselves that holds a list of
classloaders and loads classes from any of them. They can even write
the bundle ids in the cache if the list of bundles can only be
determined at runtime.

Of course, Augusto's bundles don't access each other's objects in the
cache, so his problem could also be solved by giving each bundle its
own CacheDelegateWithContext (or CacheDelegateSession, if you will)
instance configured with that bundle's classloader, removing the need
for the DynamicImport-Package directive in his central bundle.

Cheers
Dan
___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] https://issues.jboss.org/browse/ISPN-977

2011-05-12 Thread Dan Berindei
On Wed, May 11, 2011 at 8:47 PM, Erik Salter esal...@bnivideo.com wrote:
 Wouldn't any rehash affect the locality of these generated keys, or am I 
 missing something?


Indeed, the moment a new node joins the cluster you have no guarantees
that the keys that used to be on the same node are still stored
together.

Cheers
Dan



 -Original Message-
 From: infinispan-dev-boun...@lists.jboss.org 
 [mailto:infinispan-dev-boun...@lists.jboss.org] On Behalf Of Dan Berindei
 Sent: Wednesday, May 11, 2011 12:47 PM
 To: infinispan -Dev List
 Subject: Re: [infinispan-dev] https://issues.jboss.org/browse/ISPN-977

 On Wed, May 11, 2011 at 5:51 PM, Mircea Markus mircea.mar...@jboss.com 
 wrote:

 On 11 May 2011, at 13:57, Dan Berindei wrote:

 With the new push-based rebalancing code, we install the new CH
 before doing any rebalancing work, so the inconsistency window should
 be much smaller.
 In theory we also block new transactions, maybe you could use that to
 stop generating new keys while rehashing.
 that sounds interesting, I'll take a look!

 On the other hand, in 5.0 you should be able to implement
 KeyAffinityService on top of Pete's grouping work for
 https://issues.jboss.org/browse/ISPN-312 and then you won't need any
 listeners.
 Not sure how that could work? KAS generates pseudo-random keys that map to a 
 certain node. It's up to the user to place them in the cache at some point, if 
 ever.

 I was thinking of wrapping the keys generated by the user's KeyGenerator 
 together with a node address in our own MagicKey class and setting the @Group 
 annotation on the address field.
 We already have a MagicKey class for testing, the only problem with it is 
 that it only uses the address for hashing, so you'd get lots of collisions. 
 With Pete's changes MagicKey becomes something we can actually expose to the 
 users (I was thinking through the KeyAffinityService API).

 I didn't think this through however, because MagicKey<K> would have to extend 
 K in order to satisfy the KeyAffinityService API (and even to be able to 
 store the key in the Cache<K,V> later).

 Your question does raise an interesting point, though: a new node can join 
 between the user call to
 KeyAffinityService.getKeyForAddress(address) and Cache.put(generatedKey), 
 making generatedKey map to the new node. So there is no way for us to 
 completely eliminate the issue of stale
 keys.



 Dan


 On Wed, May 11, 2011 at 3:29 PM, Sanne Grinovero
 sanne.grinov...@gmail.com wrote:
 First thing I thought when reading your email was OMG do we support
 on-the-fly hash implementation changes? crazy!

  That's obviously not the case, but if you name it
  @ConsistentHashChangeListener that's what I would think.

  Wouldn't it be better to change the exact timing of the view change
  event? I don't see why Infinispan users would be more interested in
  knowing about the topology details according to the transport than
  about the actual Infinispan hashing topology - I would
  expect that when I receive such a notification the new view is
  already installed and ready to go; actually I thought that was the
  case since forever.

 What would be the use cases to get the notification *before* the new
 hash is installed?

 Cheers,
 Sanne

 2011/5/11 Mircea Markus mircea.mar...@jboss.com:
 Hi,
 The basic problem behind this is that I need to be notified when a
 new consistent hash is installed.

  ATM there isn't any support (of which I know) for a
  @ConsistentHashChangeListener.

 I'm thinking to add such notifications either:
 a) internally: Observer pattern on DistributionManager or even on
 DistributionManagerImpl
  b) more generically, as a fully fledged listener.

  I favor a) and then if more people ask for it we will expose it as
  a fully fledged listener.

 Suggestions?

 Cheers,
 Mircea

 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Generated keys affected by rehash Was: https://issues.jboss.org/browse/ISPN-977

2011-05-13 Thread Dan Berindei
On Fri, May 13, 2011 at 6:38 PM, Erik Salter esal...@bnivideo.com wrote:
 Yes, collocation of all keys is a large concern of my application(s).

 Currently, I can handle keys I'm in control of (like database-generated 
 keys), where I can play around with the hash code.   What I would love to do 
 is collocate that data with keys I can't control (like UUIDs) so that all 
 cache operations can be done in the same transaction on the data owner's node.


By playing around with the hash code do you mean you set the hashcode
for all the keys you want on a certain server to the same value? I
imagine that would degrade performance quite a bit, because we have
HashMaps everywhere and your keys will always end up in the same hash
bucket.

ISPN-312 seems to me like a much better fit for your use case than the
KeyAffinityService. Even if you added a listener to change your keys
when the topology changes, that would mean on a rehash the keys could
get moved to the new server and then back to the old server, whereas
with ISPN-312 they will either all stay on the old node or they will
all move to the new node.

Cheers
Dan


 Erik

 -Original Message-
 From: infinispan-dev-boun...@lists.jboss.org 
 [mailto:infinispan-dev-boun...@lists.jboss.org] On Behalf Of Manik Surtani
 Sent: Friday, May 13, 2011 5:25 AM
 To: infinispan -Dev List
 Subject: [infinispan-dev] Generated keys affected by rehash Was: 
 https://issues.jboss.org/browse/ISPN-977


 On 11 May 2011, at 18:47, Erik Salter wrote:

 Wouldn't any rehash affect the locality of these generated keys, or am I 
 missing something?

 It would.  And hence ISPN-977, to address that.  Or is your concern keys 
 already generated before the rehash?  The latter would certainly be a 
 problem.  Perhaps, if this was important to the application, on detecting a 
 change in topology, re-generate keys and move data around?  For other apps, 
 move the session to the appropriate node?

 Cheers
 Manik
 --
 Manik Surtani
 ma...@jboss.org
 twitter.com/maniksurtani

 Lead, Infinispan
 http://www.infinispan.org




 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] [Pull Request] Modular Classloading Compatibility

2011-05-13 Thread Dan Berindei
On Fri, May 13, 2011 at 5:48 PM, Jason T. Greene
jason.gre...@redhat.com wrote:
 On 5/12/11 4:18 AM, Dan Berindei wrote:

 On Wed, May 11, 2011 at 11:18 PM, David Bosschaert da...@redhat.com wrote:

 On 11/05/2011 17:54, Dan Berindei wrote:

 On Wed, May 11, 2011 at 7:08 PM, Pete Muir pm...@redhat.com wrote:

 Were we developing for OSGi I would certainly agree with you. However
 in many environments today we can reasonably expect the TCCL to be set and
 to be able to load the classes we need. So whilst making it part of the 
 API
 is the safest option, it's also complicating the API for the sake of
 the few at the cost of the many. Further this also seems kinda nasty to 
 me.
 We know the class (and hence bundle/module) when we put the object into
 Infinispan, therefore why do we require people to respecify this again?

 David, can we not actually do something here akin to what we are
 discussing for Weld? Whereby we can serialize out the bundle id and then
 find the correct CL based on that when we deserialize.

 What if the object is a java.util.ArrayList? Each element in the list
 could belong to a different bundle, so you'd have to write a bundle id
 for every element in the list.

 Yes, if you know the Bundle-SymbolicName and Version (or the Bundle ID)
 you can find its classloader.

 On the other question, if you're passing in a class object then you can
 obtain its classloader and hence the bundle where it came from. But, and
 I think this is what Dan alluded to above, is it always true that the
 class you're passing in comes from the bundle that you need to have, or
 could it also come from one of its parent class loaders?


 Exactly David, sorry if my message was a little cryptic. I think in
 order to handle every case properly you would have to go through the
 entire object graph being stored in the cache in order to find all the
 classloaders/bundle ids that you will need on get().

 This approach just doesn't scale, and it would not work in all environments
 (there is no guarantee you can get the original classloader if the
 environment doesn't allow you to look them up by some kind of unique id).

 That seems like a lot of overhead to me, and forcing the user to
 provide the classloader doesn't seem that bad in comparison. Perhaps
 we should use something other than a thread-local for this though

 Yes thread locals are brittle, and tend to leak if they aren't managed
 correctly. Using a context specific instance is a much better approach.

 so
 that users can do a onto the result of a
 cacheManager.getCache(A).usingClassLoader(A.class) and never have to
 provide the classloader again.


 That's similar to what I was proposing with the CacheSession notion. The
 basic notion is that you move thread/context specific state to a separate
 object that the caller uses (optionally), and it mirrors the same Cache API.


The way I understood your proposal was that the session would be
something visible to the user, so he would have to call
cacheManager.getCache(A).getSession() in order to access the context
functionality.
My version is essentially the same thing, except the session would be
created transparently when the user calls usingClassLoader() or
withFlags().

 In fact I think this is a good idea for the invocation flags we
 already have, too. It would involve creating lots of overloads in
 CacheDelegate with a PreInvocationContext parameter and a new
 CacheDelegateWithContext class to invoke those methods, but the public
 API would remain the same.

 You don't need to reuse the same impl. Just create a new delegate which
 passes an extra invocation state parameter with every call. The overhead of
 that is tiny.


Some of the methods in CacheDelegate do have lots of logic in them, so
we have to reuse them. The way I see it the existing methods will
change to receive the context parameter explicitly instead of
implicitly, and the methods with the old signature will call them with
a default context. usingClassLoader/withFlags will create an instance
of CacheDelegateSession, and the methods in CacheDelegateSession will
modify/use the context stored in that instance.

I agree the overhead is tiny, and the best thing is users can optimize
it away by moving the usingClassLoader/withFlags calls outside of
loops.

Cheers
Dan

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


[infinispan-dev] Compilation errors in the infinispan-spring module

2011-05-16 Thread Dan Berindei
Hi Olaf,

Did you see any problems with RHQ + Spring interaction that led you
to exclude the rhq-pluginAnnotations dependency in the spring
module?

  <dependency>
     <groupId>${project.groupId}</groupId>
     <artifactId>infinispan-core</artifactId>
     <version>${project.version}</version>
     <scope>compile</scope>
     <exclusions>
        <exclusion>
           <groupId>org.rhq.helpers</groupId>
           <artifactId>rhq-pluginAnnotations</artifactId>
        </exclusion>
     </exclusions>
  </dependency>


I've been getting some weird errors while building the
infinispan-spring module, both with OpenJDK 1.6.0_20 and with Sun JDK
1.6.0_24 and 1.6.0_25, and they seem to appear because the compiler
doesn't have access to the RHQ annotations:

/tmp/privatebuild/home/dan/Work/infinispan/master/core/classes/org/infinispan/manager/DefaultCacheManager.class:
warning: Cannot find annotation method 'displayName()' in type
'org.rhq.helpers.pluginAnnotations.agent.Metric': class file for
org.rhq.helpers.pluginAnnotations.agent.Metric not found
/tmp/privatebuild/home/dan/Work/infinispan/master/core/classes/org/infinispan/manager/DefaultCacheManager.class:
warning: Cannot find annotation method 'dataType()' in type
'org.rhq.helpers.pluginAnnotations.agent.Metric'
An exception has occurred in the compiler (1.6.0_24). Please file a
bug at the Java Developer Connection
(http://java.sun.com/webapps/bugreport)  after checking the Bug Parade
for duplicates. Include your program and the following diagnostic in
your report.  Thank you.
com.sun.tools.javac.code.Symbol$CompletionFailure: class file for
org.rhq.helpers.pluginAnnotations.agent.DataType not found

Galder has seen it too with Sun JDK 1.6.0_24, but strangely enough
everyone else is able to build without any errors.

I'm thinking of removing the rhq-pluginAnnotations exclusion from the
infinispan-spring pom.xml, the question is whether this would break
something on the Spring side. Do you know of any potential problems,
or did you do this just to reduce the number of dependencies brought
in by infinispan-spring into an application?


Cheers
Dan
___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] [Pull Request] Modular Classloading Compatibility

2011-05-16 Thread Dan Berindei
Not to be the guy that breaks the circle, how about the threads that
handle incoming RPCs from other nodes? With storeAsBinary=false
those threads have to unmarshal the objects in order to store them on
the local node, so we'd need a way to register a classloader with the
cache, not just with the cache session.

If we use a single cache for all Hibernate entities, regardless of
application, this means the cache could receive application objects
from the other nodes without the application even being deployed. Does
this mean we'll require storeAsBinary=true in all Hibernate 2LC
configurations?
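(For reference, the setting under discussion; element name as in the 5.0-era
configuration schema, quoted from memory:)

   <namedCache name="entities">
      <!-- keep values as bytes; deserialize lazily, on the reader's side -->
      <storeAsBinary enabled="true"/>
   </namedCache>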

Dan


On Mon, May 16, 2011 at 9:54 PM, David M. Lloyd david.ll...@redhat.com wrote:
 I know, we can just attach the class loader to the cache!

 Okay, just kidding, but Galder is right, this conversation is going in
 circles.  We already discussed that in this thread and a number of
 points were raised for and against.

 On 05/16/2011 01:20 PM, Adrian Cole wrote:
 What about a helper that just returns a cache with a specific
 classloader from a cache?

 cache.withLoader(cl).get(K key)

 -a


      On Mon, May 16, 2011 at 11:14 AM, Galder Zamarreño gal...@redhat.com wrote:


     On May 16, 2011, at 7:57 PM, Sanne Grinovero wrote:

       I don't like the TCCL either, so I'll repeat my suggestion from two
       weeks ago to just have:
      
       Cache c = cacheManager.getCache( cacheName, classLoader );
      
        sounds reasonable to me to have the application declare its
      intentions once ?
      
       BTW I don't like
      
        cache.get(K key, Class<V> clazz)
      
        as we're not speaking only about the get(K) method, but about many
        methods and this will explode the number of methods of Cache; on the
        other hand I think it's acceptable to have a single Cache instance
        used by a single application/classloader. You can still have multiple
        applications share the same underlying cache and use different
        classloaders:

      Guys, we're going around in circles. As I said the other week, you
      can't assume 1 cache = 1 classloader cos for example in the
      Hibernate 2LC all entities will be stored in a single cache as
      opposed to today where we have a cache per entity. And if all
      entities are stored in the same cache, we potentially have a cache
      that contains data belonging to multiple classloaders. And the
      reason for all this is cos we don't support asymmetric clusters.

     Could someone start a design wiki to grab all the requirements?

      
       getCache( cacheName, classLoader ) would return a delegate to the
       original cache, having a specific marshaller in the invocation
     context
       as Trustin was suggesting.
      
       Cheers,
       Sanne
      
      
        2011/5/16 Pete Muir pm...@redhat.com:
      
       On 16 May 2011, at 18:20, Galder Zamarreño wrote:
      
      
       On May 12, 2011, at 11:18 AM, Dan Berindei wrote:
      
        On Wed, May 11, 2011 at 11:18 PM, David Bosschaert da...@redhat.com wrote:
       On 11/05/2011 17:54, Dan Berindei wrote:
        On Wed, May 11, 2011 at 7:08 PM, Pete Muir pm...@redhat.com wrote:
       Were we developing for OSGi I would certainly agree with
     you. However in many environments today we can reasonably expect the
     TCCL to be set and to be able to load the classes we need. So whilst
      making it part of the API is the safest option, it's also
      complicating the API for the sake of the few at the cost of the many.
     Further this also seems kinda nasty to me. We know the class (and
     hence bundle/module) when we put the object into Infinispan,
     therefore why do we require people to respecify this again?
      
       David, can we not actually do something here akin to what
     we are discussing for Weld? Whereby we can serialize out the bundle
     id and then find the correct CL based on that when we deserialize.
       What if the object is a java.util.ArrayList? Each element in
     the list
       could belong to a different bundle, so you'd have to write a
     bundle id
       for every element in the list.
       Yes, if you know the Bundle-SymbolicName and Version (or the
     Bundle ID)
       you can find its classloader.
      
       On the other question, if you're passing in a class object
     then you can
       obtain its classloader and hence the bundle where it came
     from. But, and
       I think this is what Dan allused to above, is it always true
     that the
       class your passing in comes from the bundle that you need to
     have or
       could it also come from one of its parent class loaders?
      
      
       Exactly David, sorry if my message was a little cryptic. I
     think in
       order to handle every case properly you would have to go
     through the
       entire object graph being

Re: [infinispan-dev] [Pull Request] Modular Classloading Compatibility

2011-05-19 Thread Dan Berindei
On Wed, May 18, 2011 at 7:06 PM, Manik Surtani ma...@jboss.org wrote:
 Hi guys

 Sorry I've been absent from this thread for a while now (it's been growing 
 faster than I've been able to deal with email backlog!)

 Anyway, this is a very interesting discussion.  To summarise - as Pete did at 
 some point - there are 2 goals here:

 1.  Safe and intuitive use of an appropriate classloader
 2.  Safe type system for return values.

 I think the far more pressing concern is (1) so I'd like to focus on that.  
 If we think (2) is pressing enough a concern, we should spawn a separate 
 thread and discuss there.

 So, onto the issue of safe classloading.

 1) Class loader per session/cache.

 I like Jason/Sanne/Trustin's suggestions of a session-like contract, and 
 specifically I think this is best achieved as a delegate to a cache, again as 
 suggested elsewhere by Pete, etc.  E.g.,

        Cache<?, ?> myCache = cacheManager.getCache("myCache", myClassLoader);

 and what is returned is something that delegates to the actual cache, making 
 sure the TCCL is set and re-set appropriately.  The handle to the cache is 
 effectively your session and each webapp, etc in an EE environment will 
 have its own handle.  I propose using the TCCL as an internal implementation 
 detail within this delegate, which helps with making sure it is carefully 
 managed and cleaned up while not re-engineering loads of internals.


I like the API but I would not recommend using the TCCL for this. I
was able to get a nice perf jump in the HotRod client by skipping 2
Thread.setContextClassLoader() calls on each cache operation (1 to set
the TCCL we wanted and 1 to restore the original TCCL). Setting the
TCCL is a privileged operation, so it has to go through a
SecurityManager and that is very slow.
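To make the tradeoff concrete, here's a minimal sketch of the delegate idea
(hypothetical class and names, not the actual implementation), showing where
the two privileged calls come from:

    import org.infinispan.Cache;

    // Wraps a cache with a fixed classloader; every operation sets the
    // TCCL, delegates, then restores it -- two privileged calls per op.
    public class ClassLoaderAwareCache<K, V> {
        private final Cache<K, V> delegate;
        private final ClassLoader classLoader;

        public ClassLoaderAwareCache(Cache<K, V> delegate, ClassLoader classLoader) {
            this.delegate = delegate;
            this.classLoader = classLoader;
        }

        public V get(K key) {
            Thread thread = Thread.currentThread();
            ClassLoader old = thread.getContextClassLoader();
            thread.setContextClassLoader(classLoader);   // privileged call #1
            try {
                return delegate.get(key);
            } finally {
                thread.setContextClassLoader(old);       // privileged call #2
            }
        }
        // put(), remove(), etc. would wrap the delegate the same way
    }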


 I think EmbeddedCacheManager.getCache(String name, ClassLoader cl) is enough 
 ... it is clear enough, and I don't see the need for overloaded 
 getCache(name, classOfWhichClassLoaderIWishToUse).


I agree, a Cache.usingClassloader(classOfWhichClassLoaderIWishToUse)
overload would have made sense because the method name already
communicates the intention, but a getCache(name, clazz) overload is
too obscure.


 2) Class loader per invocation.

 I've been less than happy with this, not just because it pollutes the API but 
 because it adds a layer of confusion.  If all use cases discussed can be solved 
 with (1) above, then I'd prefer to just do that.

 The way I see it, most user apps directly using Infinispan would only be 
 exposed to a single class loader per cache reference (even if multiple 
 references talk to the same cache).

 Frameworks, OTOH, are a bit tougher, Hibernate being a good example on this 
 thread.  So this is a question for Galder - is it feasible to maintain 
 several references to a cache, one for each app/persistence unit?

 3) Can all OSGi requirements be handled by (1)? I would guess so, from what I 
 have read here, since the class loader is explicitly passed in when getting a 
 handle on a cache.


Yes, the only difference is that OSGi doesn't make any guarantees
about the TCCL, so passing the classloader explicitly will work in all
environments. However:

4) What about storeAsBinary=false? Threads processing requests from
other nodes are not associated with any CacheManager.getCache(name,
classloader) call, and they also have to unmarshall values with this
setting.

Since Hibernate already mandates storeAsBinary=true for its 2LC, we
can probably get away with only supporting one classloader per cache
in the storeAsBinary=false case.

Still, we can't rely on the TCCL of the background threads because
those threads are shared between all the caches in a CacheManager. In
fact we should probably set the TCCL to null for all background
threads created by Infinispan, or we risk keeping the classes of the
first application/bundle that called us alive as long as those threads
are still running.
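A sketch of what that could look like (hypothetical factory, assuming the
internal executors let us plug in a thread factory):

    import java.util.concurrent.ThreadFactory;

    // Creates background threads that don't pin a deployer's classloader:
    // without this, threads inherit the TCCL of whoever created them.
    public class NullTcclThreadFactory implements ThreadFactory {
        @Override
        public Thread newThread(Runnable r) {
            Thread t = new Thread(r);
            t.setContextClassLoader(null); // don't keep application classes alive
            return t;
        }
    }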

Dan


 Cheers
 Manik


 On 27 Apr 2011, at 17:39, Jason T. Greene wrote:

 Available here:
 https://github.com/infinispan/infinispan/pull/278

 The problem is basically that infinispan currently is using TCCL for all
 class loading and resource loading. This has a lot of problems in
 modular containers (OSGi, AS7, etc), where you dont have framework
 implementation classes on the same classloader as user classes (that is
 how they achieve true isolation)

 You can read about this in more detail here:
 http://community.jboss.org/wiki/ModuleCompatibleClassloadingGuide

 The patch in the pull request is a first step, and should fix many
 issues, but it does not address all (there is still a lot of TCCL usage
 spread out among cacheloaders and so on), and ultimately it's just a
 work around. It should, however, be compatible in any other non-modular
 environment.

 Really the ultimate solution is to setup a proper demarcation between
 what the user is supposed to provide, and what is expected to be bundled
 with infinispan. Whenever there 

Re: [infinispan-dev] [Pull Request] Modular Classloading Compatibility

2011-05-19 Thread Dan Berindei
On Thu, May 19, 2011 at 1:31 PM, Galder Zamarreño gal...@redhat.com wrote:

 On May 19, 2011, at 12:11 PM, Manik Surtani wrote:

 Guys - what are we talking about?  Specifying ClassLoaders is only 
 meaningful if storeAsBinary is set to true.

 Ok, that was not clear to me throughout the discussion.


That wasn't clear for me either. And we have at least one user trying
storeAsBinary==true because it has much better performance
(http://community.jboss.org/message/604831). Maybe the performance
difference isn't so great any more after the latest commits, but I'm
sure there is still a difference.

I'm not suggesting that we absolutely need to make storeAsBinary==true
work with multiple classloaders per cache or even per cache manager,
but we should support an alternative for OSGi users who want to use
storeAsBinary==true. That means at least having a clear way to specify
one classloader per cache manager.

Note that this will also help us in the storeAsBinary==false case,
because then we will be able to set the bootstrap classloader as the
cache manager classloader and so will be able to avoid the cache
manager threads hanging on to an undeployed application's classes.


 In general, any situation where you have code booted off different 
 ClassLoaders running in the same JVM and sharing the same cache (or cache 
 manager), you would *need* to set storeAsBinary to true to get around 
 deserialization issues on remote nodes.

 StoreAsBinary = false only really works for trivial cases where caches/cache 
 managers run in environments where only one class loader is in effect.  
 I.e., *not* JBoss/Java EE/Hibernate/OSGi/etc.  This is one of the reasons 
 why we considered setting storeAsBinary to true by default (and we see 
 similar techniques in competing data grids).


We already have an example of a guy who managed to make it work, we
just need to specify the behaviour of Infinispan so that he can rely
on it working.

Cheers
Dan


 This is clear now, thanks.


 Cheers
 Manik


 On 19 May 2011, at 10:55, Galder Zamarreño wrote:

 would be different cache instances. The problem then is that if an RPC 
 comes for the entities cache and entity P1, which of the entities caches do 
 I go for? You'd need to know which classloader P1 is living in on the remote 
 node and you'd have to know that at the Infinispan level to be able to store 
 it in a non-binary format.

 --
 Manik Surtani
 ma...@jboss.org
 twitter.com/maniksurtani

 Lead, Infinispan
 http://www.infinispan.org




 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

 --
 Galder Zamarreño
 Sr. Software Engineer
 Infinispan, JBoss Cache


 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] chunking ability on the JDBC cacheloader

2011-05-22 Thread Dan Berindei
On Mon, May 23, 2011 at 7:04 AM, 이희승 (Trustin Lee) trus...@gmail.com wrote:
 On 05/20/2011 03:54 PM, Manik Surtani wrote:
 Is spanning rows the only real solution?  As you say it would mandate using 
 transactions to keep multiple rows coherent, and I'm not sure if everyone 
 would want to enable transactions for this.

 There are more hidden overheads.  To update a value, the cache store
 must determine how many chunks already exist in the cache store and
 selectively delete and update them.  To simplify aggressively, we could
 delete all chunks and insert new chunks.  Both at the cost of great
 overhead.

 Even MySQL supports a BLOB up to 4GiB, so I think it's better to update the
 schema?


+1

BLOBs are only stored in external storage if the actual data can't fit
in a normal table row, so the only penalty in using a LONGBLOB
compared to a VARBINARY(255) is 3 extra bytes for the length.

If the user really wants to use a data type with a smaller max length,
we can just report an error when the data column size is too small. We
will need to check the length and throw an exception ourselves though;
with MySQL we can't be sure that it is configured to raise errors when
a value is truncated.
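Something along these lines is what I mean (hypothetical helper; the
dataColumnSize would come from the cache store configuration):

    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    public class LengthCheckingHelper {
        // Fail fast instead of letting the database silently truncate.
        static void setValueChecked(PreparedStatement ps, int index,
                                    byte[] value, int dataColumnSize)
                throws SQLException {
            if (value.length > dataColumnSize) {
                throw new SQLException("Marshalled value of " + value.length
                      + " bytes exceeds the configured data column size of "
                      + dataColumnSize + " bytes");
            }
            ps.setBytes(index, value);
        }
    }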

Cheers
Dan


 On 19 May 2011, at 19:06, Sanne Grinovero wrote:

 As mentioned on the user forum [1], people setting up a JDBC
 cacheloader need to be able to define the size of columns to be used.
 The Lucene Directory has a feature to autonomously chunk the segment
 contents at a configurable specified byte number, and so has the
 GridFS; still there are other metadata objects which Lucene currently
 doesn't chunk as they're fairly small (but undefined and possibly
 growing), and in a more general sense anybody using the JDBC
 cacheloader would face the same problem: what's the dimension I need
 to use?

 While in most cases the maximum size can be estimated, this is still
 not good enough, as when you're wrong the byte array might get
 truncated, so I think the CacheLoader should take care of this.

 what would you think of:
 - adding a max_chunk_size option to JdbcStringBasedCacheStoreConfig
 and JdbcBinaryCacheStore
 - have them store in multiple rows the values which would be bigger
 than max_chunk_size
 - this will need transactions, which are currently not being used by
 the cacheloaders

 It looks to me like only the JDBC cacheloader has these issues,
 as the other stores I'm aware of are more blob oriented. Could it be
 worth building this abstraction at a higher level instead of in the
 JDBC cacheloader?

 Cheers,
 Sanne

 [1] - http://community.jboss.org/thread/166760
 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

 --
 Manik Surtani
 ma...@jboss.org
 twitter.com/maniksurtani

 Lead, Infinispan
 http://www.infinispan.org




 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


 --
 Trustin Lee, http://gleamynode.net/
 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev

Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102

2011-05-23 Thread Dan Berindei
Hi Galder

Sorry I'm replying so late

On Thu, May 19, 2011 at 2:02 PM, Galder Zamarreño gal...@redhat.com wrote:
 Hi all,

 Re: https://issues.jboss.org/browse/ISPN-1102

 First of all thanks to Dan for his suggestion on reservoir 
 sampling+percentiles, very good suggestion :). So, I'm looking into this and 
 Trustin's 
 http://docs.jboss.org/netty/3.2/api/org/jboss/netty/channel/AdaptiveReceiveBufferSizePredictor.html
  but in this email I wanted to discuss the reservoir sampling mechanism 
 (http://en.wikipedia.org/wiki/Reservoir_sampling).

 So, the way I look at it, to implement this you'd keep track of N buffer 
 sizes used so far, and out of those choose K samples based on reservoir 
 sampling, and then of those K samples left, take the 90th percentile.

 Calculating the percentile is easy with those K samples stored in an ordered 
 collection. Now, my problem with this is that reservoir sampling is an O(n) 
 operation and you would not want to be doing that for each request for a 
 buffer that comes in.


The way I see it the R algorithm is only O(n) because you have to do a
random(1, i) call for each request i.
Because random(1, i) tends to return bigger and bigger values with
every request, the probability of actually modifying the collection of
samples becomes smaller and smaller. So it would be ok to compute the
buffer size estimate on every modification because those modifications
are very rare after startup.
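A rough sketch of what I have in mind (untested, names are mine):

    import java.util.Arrays;
    import java.util.Random;

    // Algorithm R over a fixed-size sample array; the 90th percentile is
    // recomputed only when the sample actually changes, which becomes
    // rarer and rarer as the number of observed sizes grows.
    public class SampledBufferSizePredictor {
        private final int[] samples = new int[64];
        private final Random random = new Random();
        private long seen;
        private volatile int estimate = 512; // arbitrary initial guess

        public synchronized void record(int bufferSize) {
            seen++;
            if (seen <= samples.length) {
                samples[(int) (seen - 1)] = bufferSize;
            } else {
                long j = (long) (random.nextDouble() * seen); // uniform in [0, seen)
                if (j >= samples.length) {
                    return; // no change to the reservoir, nothing to recompute
                }
                samples[(int) j] = bufferSize;
            }
            int n = (int) Math.min(seen, samples.length);
            int[] sorted = Arrays.copyOf(samples, n);
            Arrays.sort(sorted);
            estimate = sorted[(int) (0.9 * (n - 1))]; // 90th percentile
        }

        public int nextBufferSize() {
            return estimate;
        }
    }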

 One option I can think of is that instead of ever letting a user thread 
 calculate this, the user thread could just feed the buffer size collection (a 
 concurrent collection) and we could have a thread in the background that 
 periodically or based on some threshold calculates the reservoir sample + 
 percentile and this is what's used as next buffer size. My biggest problem 
 here is the concurrent collection in the middle. You could have a priority 
 queue ordered by buffer sizes but it's unbounded. The concurrent collection 
 does not require to be ordered though, the reservoir sampling could do that, 
 but you want the collection bounded. But if bounded and the limit is hit, 
 you would not want it to block but instead override values: remove the last 
 element and insert again. You only care about the last N relevant buffer 
 sizes...


Based on my assumption that modifications become very rare after
startup I was thinking of using a fixed size array with a synchronized
block around each access (we'd probably need synchronization for the
Random, too). Note that you must use an array or a list, because you
have to replace the element at index random(1, i) and not the
smallest/greatest/something else.

 Another option that would avoid the use of a concurrent collection would be 
 if this was calculated per thread and stored in a thread local. The 
 calculation could be done every X requests still in the client thread, or 
 could be sent to a separate thread by wrapping it in a callable and keeping 
 the future as a thread local; you could query it next time the thread wants to 
 marshall something.


True, per thread statistics could remove the need of synchronization
completely. But using a separate thread for doing the computations
would require synchronization again, and I don't think there's enough
work to be done to justify it.

 I feel a bit more inclined towards the latter option although it limits the 
 calculation to be per-thread for several reasons:
 - We already have org.jboss.marshalling.Marshaller and 
 org.jboss.marshalling.Unmarshaller instances as thread local which have 
 proven to perform well.
 - So we could tap into this set up to maintain not only the marshaller, but 
 the size of the buffer too.
 - It could offer the possibility of being extended further to avoid creating 
 buffers all the time and instead reuse them as long as the size is constant. 
 After a size recalculation we'd ditch it and create a new one.


I totally agree, combining adaptive size with buffer reuse would be
really cool. I imagine when passing the buffer to JGroups we'd still
make an arraycopy, but we'd get rid of a lot of arraycopy calls to
resize the buffer when the average object size is > 500 bytes. At the
same time, if a small percentage of the objects are much bigger than
the rest, we wouldn't reuse those huge buffers so we wouldn't waste
too much memory.

 Thoughts?
 --
 Galder Zamarreño
 Sr. Software Engineer
 Infinispan, JBoss Cache


 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102

2011-05-23 Thread Dan Berindei
On Mon, May 23, 2011 at 7:44 PM, Sanne Grinovero
sanne.grinov...@gmail.com wrote:
 To keep stuff simple, I'd add an alternative feature instead:
 have the custom externalizers to optionally recommend an allocation buffer 
 size.

 In my experience people use a set of well known types for the key, and
 maybe for the value as well, for which they actually know the output
 byte size, so there's no point in Infinispan to try guessing the size
 and then adapting on it; an exception being the often used Strings,
 even in composite keys, but again as user of the API I have a pretty
 good idea of the size I'm going to need, for each object I store.


Excellent idea: if the custom externalizer can give us the exact size
of the serialized object, we wouldn't need to do any guesswork.
I'm a bit worried about over-zealous externalizers that will spend
just as much computing the size of a complex object graph as they
spend on actually serializing the whole thing, but as long as our
internal externalizers are good examples I think we're ok.
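As a sketch, the hint could be as simple as this (hypothetical interface,
just to illustrate; in practice it would extend the existing Externalizer):

    import java.io.IOException;
    import java.io.ObjectInput;
    import java.io.ObjectOutput;

    // An externalizer variant that recommends an allocation buffer size,
    // so the marshaller can skip the adaptive guessing for known types.
    public interface SizeAwareExternalizer<T> {
        void writeObject(ObjectOutput output, T object) throws IOException;
        T readObject(ObjectInput input) throws IOException, ClassNotFoundException;

        // Recommended initial buffer size in bytes; for fixed-layout keys
        // this can just return a static constant.
        int estimatedSize(T object);
    }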

Big plus: we could use the size of the serialized object to estimate
the memory usage of each cache entry, so maybe with this we could
finally constrain the cache to use a fixed amount of memory :)


 Also in MarshalledValue I see that an ExposedByteArrayOutputStream is
 created, and after serialization if the buffer is found to be bigger
 than the buffer we're referencing a copy is made to create an exact
 matching byte[].
 What about revamping the interface there, to expose the
 ExposedByteArrayOutputStream instead of byte[], up to the JGroups
 level?

 In case the value is not stored in binary form, the expected life of
 the stream is very short anyway, after being pushed directly to
 network buffers we don't need it anymore... couldn't we pass the
 non-truncated stream directly to JGroups without this final size
 adjustment?

 Of course when values are stored in binary form it might make sense to
 save some memory, but again if that was an option I'd make use of it;
 in case of Lucene I can guess the size with a very good estimate (some
 bytes off), compared to buffer sizes of potentially many megabytes
 which I'd prefer to avoid copying - especially not interested in it to
 save 2 bytes even if I were to store values in binary.


Yeah, but ExposedByteArrayOutputStream grows by 100%, so if
you're off by 1 in your size estimate you'll waste 50% of the memory
by keeping that buffer around.

Even if your estimate is perfect you're still wasting at least 32
bytes on a 64-bit machine: 16 bytes for the buffer object header + 8
for the array reference + 4 (rounded up to 8) for the count, though I
guess you could get that down to 4 bytes by keeping the byte[] and
count as members of MarshalledValue.

Besides, for Lucene couldn't you store the actual data separately as a
byte[] so that Infinispan doesn't wrap it in a MarshalledValue?

 Then if we just keep the ExposedByteArrayOutputStream around in the
 MarshalledValue, we could save some copying by replacing the
 output.write(raw) in writeObject(ObjectOutput output,
 MarshalledValue mv) with
 output.write(byte[], offset, length);

 Cheers,
 Sanne


 2011/5/23 Bela Ban b...@redhat.com:


 On 5/23/11 6:15 PM, Dan Berindei wrote:

 I totally agree, combining adaptive size with buffer reuse would be
 really cool. I imagine when passing the buffer to JGroups we'd still
 make an arraycopy, but we'd get rid of a lot of arraycopy calls to
 resize the buffer when the average object size is > 500 bytes. At the
 same time, if a small percentage of the objects are much bigger than
 the rest, we wouldn't reuse those huge buffers so we wouldn't waste
 too much memory.


  From my experience, reusing and syncing on a buffer will be slower than
 making a simple arraycopy. I used to reuse buffers in JGroups, but got
 better perf when I simply copied the buffer.
 Plus the reservoir sampling's complexity is another source of bugs...

 --
 Bela Ban
 Lead JGroups / Clustering Team
 JBoss
 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102

2011-05-23 Thread Dan Berindei
On Mon, May 23, 2011 at 10:12 PM, Sanne Grinovero
sanne.grinov...@gmail.com wrote:
 2011/5/23 Dan Berindei dan.berin...@gmail.com:
 On Mon, May 23, 2011 at 7:44 PM, Sanne Grinovero
 sanne.grinov...@gmail.com wrote:
 To keep stuff simple, I'd add an alternative feature instead:
 have the custom externalizers to optionally recommend an allocation buffer 
 size.

 In my experience people use a set of well known types for the key, and
 maybe for the value as well, for which they actually know the output
 byte size, so there's no point in Infinispan to try guessing the size
 and then adapting on it; an exception being the often used Strings,
 even in composite keys, but again as user of the API I have a pretty
 good idea of the size I'm going to need, for each object I store.


 Excellent idea, if the custom externalizer can give us the exact size
 of the serialized object we wouldn't need to do any guesswork.
 I'm a bit worried about over-zealous externalizers that will spend
 just as much computing the size of a complex object graph as they
 spend on actually serializing the whole thing, but as long as our
 internal externalizers are good examples I think we're ok.

 I'd expect many to return a static constant, not to compute anything
 actually. Didn't prototype it, I'm just tossing some ideas so I might
 miss something.


I guess it depends on how we document it...

 Big plus: we could use the size of the serialized object to estimate
 the memory usage of each cache entry, so maybe with this we could
 finally constrain the cache to use a fixed amount of memory :)

 that's very interesting!

 Also in MarshalledValue I see that an ExposedByteArrayOutputStream is
 created, and after serialization if the buffer is found to be bigger
 than the buffer we're referencing a copy is made to create an exact
 matching byte[].
 What about revamping the interface there, to expose the
 ExposedByteArrayOutputStream instead of byte[], up to the JGroups
 level?

 In case the value is not stored in binary form, the expected life of
 the stream is very short anyway, after being pushed directly to
 network buffers we don't need it anymore... couldn't we pass the
 non-truncated stream directly to JGroups without this final size
 adjustment?

 Of course when values are stored in binary form it might make sense to
 save some memory, but again if that was an option I'd make use of it;
 in case of Lucene I can guess the size with a very good estimate (some
 bytes off), compared to buffer sizes of potentially many megabytes
 which I'd prefer to avoid copying - especially not interested in it to
 save 2 bytes even if I were to store values in binary.


 Yeah, but ExposedByteArrayOutputStream grows by 100%, so if
 you're off by 1 in your size estimate you'll waste 50% of the memory
 by keeping that buffer around.

 Not really, the current implementation has some smart logic in that
 area which reduces the potential waste to 20%;
 there's a max doubling threshold in bytes, we could set that to the
 expected value or work on the idea.


The max doubling threshold is now at 4MB, so that wouldn't kick in
in your case. But since we know we are starting from a good enough
estimate I guess we could get rid of it and always grow the buffer by
25%.

 Anyway if I happen to be often off by one, I'd better add +1 to what
 my externalizer computes as estimate ;)


I was thinking more of what happens when the structure of the object
changes (e.g. a new field is added) and the externalizer is not
updated.



 Even if your estimate is perfect you're still wasting at least 32
 bytes on a 64-bit machine: 16 bytes for the buffer object header + 8
 for the array reference + 4 (rounded up to 8) for the count, though I
 guess you could get that down to 4 bytes by keeping the byte[] and
 count as members of MarshalledValue.

 this is true, but only if you have to keep the
 ExposedByteArrayOutputStream for longer after the RPC, otherwise the
 overhead
 is zero as it's being thrown away very quickly - exactly as is done now.

 Maybe we should write a couple of alternative implementations for
 MarshalledValue, depending on which options are enabled as it's
 getting complex.

 Besides, for Lucene couldn't you store the actual data separately as a
 byte[] so that Infinispan doesn't wrap it in a MarshalledValue?

 exactly as we do ;) so in the Lucene case these optimizations apply to
 the keys only, which are mostly tuples like long, int, String,
 considering that the strings contain very short filenames I'd love to:
 a) default to something smaller than the current 128B
 b) no need to make another copy if I've overestimated by a couple of
 bytes, the buffer is very short lived anyway.


Hmm, if your keys are much shorter then I'm not sure eliminating the
arraycopy is going to help performance that much anyway.

 Cheers,
 Sanne


 Then if we just keep the ExposedByteArrayOutputStream around in the
 MarshalledValue, we could save some copying

Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102

2011-05-23 Thread Dan Berindei
On Mon, May 23, 2011 at 11:50 PM, Bela Ban b...@redhat.com wrote:


 On 5/23/11 6:44 PM, Sanne Grinovero wrote:
 To keep stuff simple, I'd add an alternative feature instead:
 have the custom externalizers to optionally recommend an allocation buffer 
 size.

 In my experience people use a set of well known types for the key, and
 maybe for the value as well, for which they actually know the output
 byte size, so there's no point in Infinispan to try guessing the size
 and then adapting on it; an exception being the often used Strings,
 even in composite keys, but again as user of the API I have a pretty
 good idea of the size I'm going to need, for each object I store.

 Also in MarshalledValue I see that an ExposedByteArrayOutputStream is
 created, and after serialization if the buffer is found to be bigger
 than the buffer we're referencing a copy is made to create an exact
 matching byte[].
 What about revamping the interface there, to expose the
 ExposedByteArrayOutputStream instead of byte[], up to the JGroups
 level?


 No need to expose the ExposedByteArrayOutputStream, a byte[] buffer,
 offset and length will do it, and we already use this today.


 In case the value is not stored in binary form, the expected life of
 the stream is very short anyway, after being pushed directly to
 network buffers we don't need it anymore... couldn't we pass the
 non-truncated stream directly to JGroups without this final size
 adjustment?


The problem is that byte[] first has to be copied to another buffer
together with the rest of the ReplicableCommand before getting to
JGroups. AFAIK in JGroups you must have 1 buffer for each message.


 You do that, yes.

 However, afair, the issue is not on the *sending*, but on the
 *receiving* side. That's where the larger-than-needed buffer sticks
 around. On the sending side, as you mentioned, Infinispan passes a
 buffer/offset/length to JGroups and JGroups passes this right on to the
 network layer, which copies that data into a buffer.


I don't think so... on the receiving side the buffer size is
controlled exclusively by JGroups, the unmarshaller doesn't create any
buffers. The only buffers on the receiving side are those created by
JGroups, and JGroups knows the message size before creating the buffer
so it doesn't have to worry about predicting buffer sizes.

On sending however I understood that JGroups keeps the buffer with the
offset and length in the NakReceivingWindow exactly as it got it from
Infinispan, without any trimming, until it receives a STABLE message
from all the other nodes in the cluster.

 --
 Bela Ban
 Lead JGroups / Clustering Team
 JBoss
 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102

2011-05-24 Thread Dan Berindei
On Tue, May 24, 2011 at 11:57 AM, Sanne Grinovero
sanne.grinov...@gmail.com wrote:
 2011/5/24 Galder Zamarreño gal...@redhat.com:
 Guys,

 Some interesting discussions here, keep them coming! Let me summarise what I 
 submitted yesterday as pull req for https://issues.jboss.org/browse/ISPN-1102

 - I don't think users can really provide such accurate predictions of the 
 objects' sizes because first java does not give you an easy way of figuring 
 out how much your object takes up and most of the people don't have such 
 knowledge. What I think could be more interesting is potentially having a 
 buffer predictor that predicts sizes per type, so rather than calculate the 
 next buffer size taking all objects into account, do that per object type. 
 To enable doing this in the future, I'm gonna add the object to be 
 marshalled as parameter to 
 https://github.com/infinispan/infinispan/pull/338/files#diff-2 - This 
 enhancement allows for your suggestions on externalizers providing estimate 
 size to be implemented, but I'm not keen on that.

 - For a solution to ISPN-1102, I've gone for a simpler adaptive buffer size 
 algorithm that Netty uses for determining the receiver buffer size. The use 
 cases are different but I liked the simplicity of the algorithm since 
 calculating the next buffer size was an O(1) op and can grow both ways very 
 easily. I agree that it might not be as exact as reservoir 
 sampling+percentile, but at least it's cheaper to compute and it resolves 
 the immediate problem of senders keeping too much memory for sent buffers 
 before STABLE comes around.

 - Next step would be to go and test this and compare it with what Bela/Dan were 
 seeing (+1 to another interactive debugging session), and if we are still 
 not too happy about the memory consumption, maybe we can look into providing 
 a different implementation for BufferSizePredictor that uses R sampling.

 - Finally, I think once ISPN-1102 is in, we should make the 
 BufferSizePredictor implementation configurable programmatically and via XML 
 - I'll create a separate JIRA for this.

 great wrap up, +1 on all points.
 BTW I definitely don't expect every user to be able to figure out the
 proper size, just that some of them might want (need?) to provide
 hints.


Looks great Galder, although I could use some comments on how the
possible buffer sizes are chosen in your algorithm :-)

I guess we were thinking of different things with the externalizer
extension. I was imagining something like an ObjectOutput
implementation that doesn't really write anything but instead it just
records the size of the object that would be written. That way the
size estimate would always be accurate, but of course the performance
wouldn't be very good for complex object graphs.
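Roughly something like this (a sketch with plain java.io, not an existing
class; the real thing would implement JBoss Marshalling's ObjectOutput):

    import java.io.IOException;
    import java.io.ObjectOutputStream;
    import java.io.OutputStream;

    // Discards the bytes and only counts them, so serializing into it
    // yields an exact size at the cost of marshalling the graph twice.
    public class CountingOutputStream extends OutputStream {
        private long count;

        @Override public void write(int b) { count++; }
        @Override public void write(byte[] b, int off, int len) { count += len; }

        public long serializedSize(Object o) throws IOException {
            count = 0; // reset for this measurement
            ObjectOutputStream oos = new ObjectOutputStream(this);
            oos.writeObject(o);
            oos.flush();
            return count; // includes the serialization stream header
        }
    }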

Still I'd like to play with something like this to see if we can
estimate the memory usage of the cache and base the eviction on the
(estimated) memory usage instead of a fixed number of entries, it
seems to me like that's the first question people ask when they start
using Infinispan.

Cheers
Dan


 Cheers,
 Sanne


 Cheers,

 On May 24, 2011, at 8:12 AM, Bela Ban wrote:



 On 5/23/11 11:09 PM, Dan Berindei wrote:

 No need to expose the ExposedByteArrayOutputStream, a byte[] buffer,
 offset and length will do it, and we already use this today.


 In case the value is not stored in binary form, the expected life of
 the stream is very short anyway, after being pushed directly to
 network buffers we don't need it anymore... couldn't we pass the
 non-truncated stream directly to JGroups without this final size
 adjustement ?


 The problem is that byte[] first has to be copied to another buffer
 together with the rest of the ReplicableCommand before getting to
 JGroups. AFAIK in JGroups you must have 1 buffer for each message.


 If you use ExposedByteArrayOutputStream, you should have access to the
 underlying buffer, so you don't need to copy it.


 You do that, yes.

 However, afair, the issue is not on the *sending*, but on the
 *receiving* side. That's where the larger-than-needed buffer sticks
 around. On the sending side, as you mentioned, Infinispan passes a
 buffer/offset/length to JGroups and JGroups passes this right on to the
 network layer, which copies that data into a buffer.


 I don't think so... on the receiving side the buffer size is
 controlled exclusively by JGroups, the unmarshaller doesn't create any
 buffers. The only buffers on the receiving side are those created by
 JGroups, and JGroups knows the message size before creating the buffer
 so it doesn't have to worry about predicting buffer sizes.

 On sending however I understood that JGroups keeps the buffer with the
 offset and length in the NakReceivingWindow exactly as it got it from
 Infinispan, without any trimming, until it receives a STABLE message
 from all the other nodes in the cluster.


 Ah, ok. I think we should really do what we said before JBW, namely have
 an interactive debugging

Re: [infinispan-dev] chunking ability on the JDBC cacheloader

2011-05-24 Thread Dan Berindei
On Tue, May 24, 2011 at 11:42 AM, Sanne Grinovero
sanne.grinov...@gmail.com wrote:
 2011/5/24 Dan Berindei dan.berin...@gmail.com:
 On Mon, May 23, 2011 at 8:05 PM, Sanne Grinovero
 sanne.grinov...@gmail.com wrote:
 2011/5/23 이희승 (Trustin Lee) trus...@gmail.com:
 On 05/23/2011 07:40 PM, Sanne Grinovero wrote:
 2011/5/23 Dan Berindeidan.berin...@gmail.com:
 On Mon, May 23, 2011 at 7:04 AM, 이희승 (Trustin Lee)trus...@gmail.com  
 wrote:
 On 05/20/2011 03:54 PM, Manik Surtani wrote:
 Is spanning rows the only real solution?  As you say it would mandate 
 using transactions to keep multiple rows coherent, and I'm not sure if 
 everyone would want to enable transactions for this.

 There are more hidden overheads.  To update a value, the cache store
 must determine how many chunks already exist in the cache store and
 selectively delete and update them.  To simplify aggressively, we could
 delete all chunks and insert new chunks.  Both at the cost of great
 overhead.

 I see no alternative to deleting all values for each key, as we don't
 know which part of the byte array is dirty;
 Which overhead are you referring to? We would still store the same
 amount of data, split or not, but yes multiple statements might
 require clever batching.


 Even MySQL supports a BLOB up to 4GiB, so I think it's better to update the
 schema?

 You mean by accommodating the column size only, or adding the chunk_id?
 I'm just asking, but all of your and Dan's feedback has already
 persuaded me that my initial idea of providing chunking should be
 avoided.

 I mean users updating the column type of the schema.


 +1

 BLOBs are only stored in external storage if the actual data can't fit
 in a normal table row, so the only penalty in using a LONGBLOB
 compared to a VARBINARY(255) is 3 extra bytes for the length.

 If the user really wants to use a data type with a smaller max length,
 we can just report an error when the data column size is too small. We
 will need to check the length and throw an exception ourselves though,
 with MySQL we can't be sure that it is configured to raise errors when
 a value is truncated.

 +1
 it might be better to just check that the stored values fit within some
 maximum size; I'm not sure if we can guess the proper size
 from database metadata: not only is the column maximum size involved,
 but MySQL (to keep it as reference example, but might apply to others)
 also has a default maximum packet size for the connections which is
 not very big; when using it with Infinispan I always had to
 reconfigure the database server.

 Also as BLOBs are very poor as primary key, people might want to use a
 limited and well known byte size for their keys.

 So, shall we just add a method to check that a user-defined threshold
 has not been surpassed, checking for both key and value but on different
 configurable sizes? Should an exception be raised in that case?

 An exception will be raised by the JDBC driver if the key doesn't fit into the key
 column, so we could simply wrap it?

 If that always happens, then I wouldn't wrap it. Entering the business
 of wrapping driver-specific exceptions is very tricky ;)
 I was more concerned about the fact that some database might not raise
 any exception? Not sure if that's the case, and possibly not our
 problem.


 By default MySQL only gives a warning if the value is truncated. We
 could throw an exception every time we got a warning from the DB, but
 the wrong value has already been inserted in the DB and if the key was
 truncated then we don't even have enough information to delete it.

 yes I had some bell ringing, it was MySQL then indeed.


 A better option to avoid checking ourselves may be to check on startup
 if STRICT_ALL_TABLES
 (http://dev.mysql.com/doc/refman/5.5/en/server-sql-mode.html#sqlmode_strict_all_tables)
 is enabled with SELECT @@SESSION.sql_mode in the MySQL implementation
 and refuse to start if it's not. There is another STRICT_TRANS_TABLES
 mode, but I don't know how to find out if a table is transactional or
 not...

 You check the transactional capabilities of a table by using SHOW
 CREATE TABLE and checking which engine it's using.

 Still I don't think we should prevent people from shooting themselves
 in the foot, I'm not going to raise another exception
 if I don't detect they have a proper backup policy either, nor if
 they're using a JDBC driver having known bugs.

 I think that what people need from us is a way to understand the size
 of what they're going to store;
 logging it as we did for ISPN-1125 is a first step, and maybe it's enough?
 Maybe it would be useful to collect max sizes, and print them
 regularly in the logs too, or expose that through MBeans?


Ok, this probably won't be such a big problem with the redesigned JDBC
cache store, but with the current design you can't put a new value
without first reading the old value, and you can't read the old value
because the data has been truncated, so the only way to get out of
this mess

Re: [infinispan-dev] 5.0.0.CR5

2011-06-08 Thread Dan Berindei
I've only got ISPN-1092 to review with Pete and some improvements
planned around ISPN-1000. Monday sounds good.

Dan


On Wed, Jun 8, 2011 at 9:08 AM, Mircea Markus mmar...@redhat.com wrote:

 I'll be mainly working on the perf benchmark, so no deliverables from me.
 On 7 Jun 2011, at 17:19, Manik Surtani ma...@jboss.org wrote:

 Hi guys

 What do we have in store for CR5?  What's been committed since CR4, and 
 what's in the pipeline for CR5?  Does a release on Monday (13th) make sense?

 Cheers
 Manik
 --
 Manik Surtani
 ma...@jboss.org
 twitter.com/maniksurtani

 Lead, Infinispan
 http://www.infinispan.org




 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] frequent stack in testsuite

2011-06-09 Thread Dan Berindei
I don't think it's an externalizer issue, as I also see some
exceptions on the node that generates state:

2011-06-09 18:16:18,250 ERROR
[org.infinispan.remoting.transport.jgroups.JGroupsTransport]
(STREAMING_STATE_TRANSFER-sender-1,Infinispan-Cluster,NodeA-57902)
ISPN00095: Caught while responding to state transfer request
org.infinispan.statetransfer.StateTransferException:
java.util.concurrent.TimeoutException:
STREAMING_STATE_TRANSFER-sender-1,Infinispan-Cluster,NodeA-57902 could
not obtain exclusive processing lock after 10 seconds.  Locks in
question are 
java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock@a35c90[Read
locks = 1] and 
java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock@111fb7f[Unlocked]
at 
org.infinispan.statetransfer.StateTransferManagerImpl.generateState(StateTransferManagerImpl.java:177)
at 
org.infinispan.remoting.InboundInvocationHandlerImpl.generateState(InboundInvocationHandlerImpl.java:248)
at 
org.infinispan.remoting.transport.jgroups.JGroupsTransport.getState(JGroupsTransport.java:585)
at 
org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.handleUpEvent(MessageDispatcher.java:690)
at 
org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:771)
at org.jgroups.JChannel.up(JChannel.java:1484)
at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1074)
at 
org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER$StateProviderHandler.process(STREAMING_STATE_TRANSFER.java:651)
at 
org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER$StateProviderThreadSpawner$1.run(STREAMING_STATE_TRANSFER.java:580)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:636)
Caused by: java.util.concurrent.TimeoutException:
STREAMING_STATE_TRANSFER-sender-1,Infinispan-Cluster,NodeA-57902 could
not obtain exclusive processing lock after 10 seconds.  Locks in
question are 
java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock@a35c90[Read
locks = 1] and 
java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock@111fb7f[Unlocked]
at 
org.infinispan.remoting.transport.jgroups.JGroupsDistSync.acquireProcessingLock(JGroupsDistSync.java:100)
at 
org.infinispan.statetransfer.StateTransferManagerImpl.generateTransactionLog(StateTransferManagerImpl.java:204)
at 
org.infinispan.statetransfer.StateTransferManagerImpl.generateState(StateTransferManagerImpl.java:167)
... 11 more

I guess we could write an error marker in the stream to prevent the
EOFException on the receiving side, but the end result would be the
same.

Dan


On Thu, Jun 9, 2011 at 5:58 PM, Sanne Grinovero sa...@infinispan.org wrote:
 Hello all,
 if I happen to look at the console while the tests are running, I see
 this exception popup very often:

 2011-06-09 15:32:18,092 ERROR [JGroupsTransport]
 (Incoming-1,Infinispan-Cluster,NodeB-32230) ISPN00096: Caught while
 requesting or applying state
 org.infinispan.statetransfer.StateTransferException:
 java.io.EOFException: Read past end of file
        at 
 org.infinispan.statetransfer.StateTransferManagerImpl.applyState(StateTransferManagerImpl.java:333)
        at 
 org.infinispan.remoting.InboundInvocationHandlerImpl.applyState(InboundInvocationHandlerImpl.java:230)
        at 
 org.infinispan.remoting.transport.jgroups.JGroupsTransport.setState(JGroupsTransport.java:602)
        at 
 org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.handleUpEvent(MessageDispatcher.java:711)
        at 
 org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:771)
        at org.jgroups.JChannel.up(JChannel.java:1441)
        at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1074)
        at 
 org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.connectToStateProvider(STREAMING_STATE_TRANSFER.java:523)
        at 
 org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.handleStateRsp(STREAMING_STATE_TRANSFER.java:462)
        at 
 org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.up(STREAMING_STATE_TRANSFER.java:223)
        at org.jgroups.protocols.FRAG2.up(FRAG2.java:189)
        at org.jgroups.protocols.FC.up(FC.java:479)
        at org.jgroups.protocols.pbcast.GMS.up(GMS.java:891)
        at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:246)
        at org.jgroups.protocols.UNICAST.handleDataReceived(UNICAST.java:613)
        at org.jgroups.protocols.UNICAST.up(UNICAST.java:294)
        at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:703)
        at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:133)
        at org.jgroups.protocols.FD.up(FD.java:275)
        at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:275)
        at org.jgroups.protocols.MERGE2.up(MERGE2.java:209)
        at 

Re: [infinispan-dev] Issue posted for JBoss Logging and slf4j

2011-06-20 Thread Dan Berindei
Hi Jeff


On Mon, Jun 20, 2011 at 10:48 AM, Jeff Ramsdale jeff.ramsd...@gmail.com wrote:
 I've posted https://issues.jboss.org/browse/JBLOGGING-65 to address an
 issue I've run into converting to Infinispan 5.0.0.CR5. The issue
 concerns the switch to JBoss Logging and its failure to support
 log4j-over-slf4j, a jar provided by slf4j to adapt log4j events and
 route them to slf4j.

 JBoss Logging doesn't detect that org.apache.log4j.LogManager is
 provided by the log4j-over-slf4j jar and selects the
 Log4jLoggerProvider instead of the Slf4jLoggerProvider.

 I still haven't nailed down what's going on with logging in Infinispan
 5.0.0 but I'm not getting any logging output and have run into several
 issues. See https://issues.jboss.org/browse/ISPN-1177 as well (it
 seems Maven 3 is required to build Infinispan 5?). I have concerns
 that folks converting from 4.x will have issues as I have and wanted
 to raise a red flag before release. Anybody else seeing a degradation
 in logging behavior?


I'm building Infinispan with Maven 2.2.1 on Fedora 14 and it works fine.
I haven't noticed ISPN-1177 before, but I do get the error from time
to time when I build from IntelliJ, because the IntelliJ compiler
doesn't do annotation processing (or at least I haven't been able to
configure it properly).
Then when you run 'mvn install' maven sees that the class is up to
date and doesn't run annotation processing either. Running 'mvn clean
install' usually sorts everything out.

Dan


 -jeff
 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Alert from a failing test

2011-06-21 Thread Dan Berindei
On Mon, Jun 20, 2011 at 11:42 PM, Sanne Grinovero sa...@infinispan.org wrote:
 2011/6/20 Manik Surtani ma...@jboss.org:
 Oddly enough, I don't see any other tests exhibiting this behaviour.  Let me 
 know if you see it in more recent CI runs, and we'll investigate in detail.

 In fact there are not many tests in core which verify a full stream is
 received; but as I mentioned in another thread, I was seeing the
 following exception relatively often (it never caught my attention
 some months ago)

 Caused by: java.io.EOFException: The stream ended unexpectedly.
 Please check whether the source of the stream encountered any issues
 generating the stream.
        at 
 org.infinispan.marshall.VersionAwareMarshaller.objectFromObjectStream(VersionAwareMarshaller.java:193)
        at 
 org.infinispan.statetransfer.StateTransferManagerImpl.processCommitLog(StateTransferManagerImpl.java:218)
        at 
 org.infinispan.statetransfer.StateTransferManagerImpl.applyTransactionLog(StateTransferManagerImpl.java:245)
        at 
 org.infinispan.statetransfer.StateTransferManagerImpl.applyState(StateTransferManagerImpl.java:284)
        ... 27 more
 Caused by: java.io.EOFException: Read past end of file
        at 
 org.jboss.marshalling.SimpleDataInput.eofOnRead(SimpleDataInput.java:126)
        at 
 org.jboss.marshalling.SimpleDataInput.readUnsignedByteDirect(SimpleDataInput.java:263)
        at 
 org.jboss.marshalling.SimpleDataInput.readUnsignedByte(SimpleDataInput.java:224)
        at 
 org.jboss.marshalling.river.RiverUnmarshaller.doReadObject(RiverUnmarshaller.java:209)
        at 
 org.jboss.marshalling.AbstractObjectInput.readObject(AbstractObjectInput.java:37)
        at 
 org.infinispan.marshall.jboss.GenericJBossMarshaller.objectFromObjectStream(GenericJBossMarshaller.java:191)
        at 
 org.infinispan.marshall.VersionAwareMarshaller.objectFromObjectStream(VersionAwareMarshaller.java:191)
        ... 30 more


The line at 
org.jboss.marshalling.river.RiverUnmarshaller.doReadObject(RiverUnmarshaller.java:209)
suggests EOF has been reached while reading the lead byte of the
object, not a partial object. This is consistent with
StateTransferManagerImpl.generateTransactionLog() getting a timeout
while trying to acquire the processing lock (at
StateTransferManagerImpl.java:192) and closing the stream in the
middle of the transaction log, not a transmission error. We could
probably get rid of the exception in the logs by inserting another
delimiter here.

Back to the original problem, if this was a stream corruption issue
I'd expect a lot more instances of deserializing errors because the
length of the buffer was smaller/larger than the number of bytes
following it and the next object to be deserialized from the stream
found garbage instead.

This looks to me more like an index segment has been created with size
x on node A and also on node B, then it was updated with size y > x on
node A but only the metadata got to node B, the segment's byte array
remained the same.

I don't know anything about the Lucene directory implementation yet,
so I have no idea if/how this could happen and I haven't been able to
reproduce it on my machine. Is there a way to see the Jenkins test
logs?

Dan


 This looks like a suspicious correlation to me, as I think the
 reported errors are similar in nature.

 Cheers,
 Sanne




 On 18 Jun 2011, at 20:18, Sanne Grinovero wrote:

 Hello all,
 I'm not in state to fully debug the issue this week, but even though
 this failure happens in the Lucene Directory it looks like it's
 reporting an issue with Infinispan core:

 https://infinispan.ci.cloudbees.com/job/Infinispan-master-JDK6-tcp/90/org.infinispan$infinispan-lucene-directory/testReport/junit/org.infinispan.lucene/SimpleLuceneTest/org_infinispan_lucene_SimpleLuceneTest_testIndexWritingAndFinding/

 In this test we're writing to the index, and then asserting on the
 expected state on both nodes, but while it is successful on the same
 node as the writer, it fails with
 java.io.IOException: Read past EOF on the second node.

 This exception can mean only one thing: the value, which is a
 buffer[], was not completely transferred to the second node, which
 seems quite critical as the caches are using sync.
 I can't reproduce the error locally, but it's not the first time it is
 reported by CI: builds 60, 62, 65 for example (and more) show the same
 testcase fail in the same manner.

 Cheers,
 Sanne
 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

 --
 Manik Surtani
 ma...@jboss.org
 twitter.com/maniksurtani

 Lead, Infinispan
 http://www.infinispan.org




 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


 ___
 infinispan-dev mailing list
 

Re: [infinispan-dev] Atomic operations and transactions

2011-07-05 Thread Dan Berindei
On Tue, Jul 5, 2011 at 12:23 PM, Galder Zamarreño gal...@redhat.com wrote:


 On Jul 4, 2011, at 11:25 AM, Sanne Grinovero wrote:

 I agree they don't make sense, but only in the sense of exposed API
 during a transaction: some time ago I admit I was expecting them to
 just work: the API is there, nice public methods in the public
 interface with javadocs explaining that that was exactly what I was
 looking for, no warnings, no failures. Even worse, all works fine when
 running a local test because of how the locks currently work: they are
 acquired locally first, so unless you're running such a test in DIST
 mode, and happen to be *not* the owner of the being tested key, people
 won't even notice that this is not supported.

 Still being able to use them is very important, also in combination
 with transactions: I might be running blocks of transactional code
 (like a CRUD operation via OGM) and still require to advance a
 sequence for primary key generation. This needs to be an atomic
 operation, and I should really not forget to suspend the transaction.

 Fair point. At first glance, the best way to deal with this is suspending the 
 tx cos that guarantees the API contract while not forcing locks to be 
 acquired for too long.

 I'd advise though that whoever works on this needs to go over existing 
 use cases and see if the end result could differ somehow if this change gets 
 applied. If any divergences are found and are to be expected, these need to 
 be thoroughly documented.

 I've gone through some cases and end results would not differ at first glance 
 if the atomic ops suspend the txs. The only thing that would change would be 
 the expectations of lock acquisition timeouts by atomic ops within txs.

 For example:

 Cache contains: k1="galder"

 1. Tx1 does a cache.replace(k1, "galder", "sanne") -> suspends tx and applies 
 change -> k1="sanne" now
 2. Tx2 does a cache.replace(k1, "galder", "manik") -> suspends tx and is not 
 able to apply change
 3. Tx2 commits
 4. Tx1 commits
 End result: k1="sanne"

 1. Tx1 does a cache.replace(k1, "galder", "sanne") -> acquires lock
 2. Tx2 does a cache.replace(k1, "galder", "manik") -> waits for lock
 3. Tx2 rollback -> times out acquiring lock
 4. Tx1 commits -> applies change
 End result: k1="sanne"


Galder, the big difference would be that with optimistic transactions
you don't acquire the lock on the spot, so the tx will be rolled back
if someone else modifies the key between your get and the prepare.
We could make a compromise, instead of forcing the atomic operations
to happen outside the transaction we could force them to always use
pessimistic locking. Actually Mircea and I discussed this Friday
evening too, but I forgot about it until now.

After all Sanne has two use cases for atomic operations: sequences and
reference counts. Sequences can and should happen outside
transactions, but as we discussed on the train we could make each
node/thread acquire a range of say 100 seq numbers at a time and
remove the need for any locking to get a new sequence number.
Reference counts on the other hand should remain inside the
transaction, because you would have to do the refcount rollback by
hand (plus I don't think you should let other transactions see the
modified refcount before they see the new data).
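For the sequence case, a rough sketch of the range idea (hypothetical
helper, meant to be used outside any transaction; only putIfAbsent/replace
are real cache operations):

    import org.infinispan.Cache;

    // Each node/thread reserves a block of 100 sequence numbers with a
    // single atomic replace, then hands them out locally with no locking.
    public class RangeSequence {
        private static final long RANGE = 100;
        private final Cache<String, Long> cache;
        private final String key;
        private long next, limit; // next == limit: local range exhausted

        public RangeSequence(Cache<String, Long> cache, String key) {
            this.cache = cache;
            this.key = key;
        }

        public synchronized long nextValue() {
            while (next == limit) {
                Long current = cache.putIfAbsent(key, RANGE);
                if (current == null) {
                    next = 0; limit = RANGE; // we created the counter, own [0, RANGE)
                } else if (cache.replace(key, current, current + RANGE)) {
                    next = current; limit = current + RANGE;
                } // else: lost the race with another node, retry
            }
            return next++;
        }
    }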

Dan


 Sanne

 2011/7/4 Galder Zamarreño gal...@redhat.com:
 Do these atomic operations really make sense within an (optimitic) 
 transaction?

 For example, putIfAbsent(): it stores a k,v pair if the key is not present. 
 But the key to its usability is that the return of putIfAbsent can tell 
 you whether the put succeeded or not.

 Once you go into transactions, the result is only valid once the 
 transaction has been prepared unless the pessimistic locking as per 
 definition in http://community.jboss.org/docs/DOC-16973 is in use, and 
 that's already pretty confusing IMO.

 I get the feeling that those atomic operations are particularly useful when 
 transactions are not used cos they allow you to reduce the cache operations 
 to one, hence avoiding the need to use a lock or synchronized block, or in 
 our case, a transaction.

 On Jun 30, 2011, at 3:11 PM, Sanne Grinovero wrote:

 Hello all,
 some team members had a meeting yesterday, one of the discussed
 subjects was about using atomic operations (putIfAbsent, etc..).
 Mircea just summarised it in the following proposal:

 The atomic operations, as defined by the ConcurrentHashMap, don't fit
 well within the scope of optimistic transactions: this is because there
 is a discrepancy between the value returned by the operation and the
 fact that the operation is applied or not:
 E.g. putIfAbsent(k, v) might return true as there's no entry for k in
 the scope of the current transaction, but in fact there might be a
 value committed by another transaction, hidden by the fact we're
 running in repeatable read mode.
 Later on, at prepare time when the same operation is applied on the
 node that actually holds k, 

Re: [infinispan-dev] Atomic operations and transactions

2011-07-05 Thread Dan Berindei
On Tue, Jul 5, 2011 at 12:49 PM, Sanne Grinovero sa...@infinispan.org wrote:

 2011/7/5 Dan Berindei dan.berin...@gmail.com:

 After all Sanne has two use cases for atomic operations: sequences and
 reference counts. Sequences can and should happen outside
 transactions, but as we discussed on the train we could make each
 node/thread acquire a range of say 100 seq numbers at a time and
 remove the need for any locking to get a new sequence number.
 Reference counts on the other hand should remain inside the
 transaction, because you would have to do the refcount rollback by
 hand (plus I don't think you should let other transactions see the
 modified refcount before they see the new data).

 To clarify my use case, I'm never going to need refcounts together with
 a running transaction. The refcount example was about why I need
 atomic counters, but this thread is about the dangers of exposing
 this atomic operations API, which is broken when using transactions.


Oh, I understand refcounts are not really the topic you wanted to talk
about, but I'd really like to know why you don't need transactions
with refcounts. I was assuming the refcount is a link from a key A in
the cache to another key B, so adding another link would involve
incrementing the counter as well as adding the key A.

Or do you mean that refcount increment/decrement should be a special
operation that never fails, so it could be done after the commit of
the tx that added key A?
___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Atomic operations and transactions

2011-07-05 Thread Dan Berindei
On Tue, Jul 5, 2011 at 12:46 PM, Sanne Grinovero sa...@infinispan.org wrote:
 2011/7/5 Galder Zamarreño gal...@redhat.com:


 On Jul 4, 2011, at 11:25 AM, Sanne Grinovero wrote:

 I agree they don't make sense, but only in the sense of exposed API
 during a transaction: some time ago I admit I was expecting them to
 just work: the API is there, nice public methods in the public
 interface with javadocs explaining that this was exactly what I was
 looking for, no warnings, no failures. Even worse, all works fine when
 running a local test because of how the locks currently work: they are
 acquired locally first. So unless you're running such a test in DIST
 mode, and happen to be *not* the owner of the key being tested, people
 won't even notice that this is not supported.

 Still being able to use them is very important, also in combination
 with transactions: I might be running blocks of transactional code
 (like a CRUD operation via OGM) and still need to advance a
 sequence for primary key generation. This needs to be an atomic
 operation, and I should really not forget to suspend the transaction.

 Fair point. At first glance, the best way to deal with this is suspending 
 the tx, because that guarantees the API contract while not forcing locks to 
 be acquired for too long.

 I'd advise that whoever works on this needs to go over 
 existing use cases and see if the end result could differ somehow if this 
 change gets applied. If any divergences are found or are to be expected, 
 these need to be thoroughly documented.

 I've gone through some cases and end results would not differ at first 
 glance if the atomic ops suspend the txs. The only thing that would change 
 would be the expectations of lock acquisition timeouts by atomic ops within 
 txs.

 For example:

 Cache contains: k1=galder

 1. Tx1 does a cache.replace(k1, galder, sanne) -> suspends tx and 
 applies change -> k1=sanne now
 2. Tx2 does a cache.replace(k1, galder, manik) -> suspends tx and is not 
 able to apply change
 3. Tx2 commits
 4. Tx1 commits
 End result: k1=sanne

 Right.
 To clarify, this is what would happen with the current implementation:

 1. Tx2 does a cache.get(k1) -> it reads the value of k1, and is
 returned galder
 2. Tx1 does a cache.replace(k1, galder, sanne) -> k1=sanne in
 the scope of this transaction, but not seen by other txs
 3. Tx2 does a cache.replace(k1, galder, manik) -> k1=manik is
 assigned, because with repeatable read we're still seeing galder
 4. Tx2 and Tx1 commit

 ..and the end result depends on who commits first.


 1. Tx1 does a cache.replace(k1, galder, sanne) -> acquires lock
 2. Tx2 does a cache.replace(k1, galder, manik) -> waits for lock
 3. Tx2 rollback -> times out acquiring lock
 4. Tx1 commits -> applies change
 End result: k1=sanne

 I'm not sure we're on the same page here. 1) should apply the
 operation right away, so even if it might very briefly have to acquire
 a lock on it, the lock is immediately released (not at the end of the
 transaction), so why would Tx2 have to wait for it to the point that it
 needs to roll back?


I think it would make sense to make atomic operations pessimistic by
default, so they would behave like in Galder's example.

Then if you wanted to reduce contention you could suspend/resume the
transaction around your atomic operations and make them behave like
you're expecting them to.

Here is a contrived example:

1. Start tx Tx1
2. cache.get(k) -> v0
3. cache.replace(k, v0, v1)
4. cache.get(k) -> ??

With repeatable read and suspend/resume around atomic operations, I
believe operation 4 would return v0, and that would be very
surprising for a new user.
So I'd rather require explicit suspend/resume calls to make sure
anyone who uses atomic operations in a transaction understands what
results he's going to get.
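For reference, a minimal sketch of the explicit suspend/resume pattern being argued for, using the standard JTA API (method and variable names are illustrative):

    import javax.transaction.Transaction;
    import javax.transaction.TransactionManager;
    import org.infinispan.Cache;

    class AtomicOps {
        // Run one atomic operation outside the surrounding transaction, so
        // it sees (and publishes) committed data immediately instead of the
        // repeatable read snapshot.
        static <K, V> boolean replaceOutsideTx(TransactionManager tm,
                Cache<K, V> cache, K k, V oldValue, V newValue) throws Exception {
            Transaction suspended = tm.suspend(); // detach the caller's tx
            try {
                return cache.replace(k, oldValue, newValue); // non-transactional
            } finally {
                if (suspended != null)
                    tm.resume(suspended); // re-attach the original tx
            }
        }
    }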

Dan

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Atomic operations and transactions

2011-07-05 Thread Dan Berindei
On Tue, Jul 5, 2011 at 1:45 PM, Sanne Grinovero sa...@infinispan.org wrote:
 That's an interesting case, but I wasn't thinking about that. So it
 might become useful later on.
 The refcount scenario I'd like to improve first is about garbage
 collection of old unused index segments,
 we're counting references from open searchers and allow queries that
 are still running on open segments to finish
 while the segments are being deleted: at the same time, these delete
 operations are run in batches in background threads,
 I couldn't possibly run them in a transaction as it would likely not
 have enough memory to complete it, and anyway they're run async
 to the rest of the application. So timing is not very critical, but
 having a wrong increment/decrement on the counter can cause
 many issues and so this must rely on atomic operations. Current
 implementation acquires a pessimistic lock on the integer,
 having a distributed AtomicInteger as discussed on the train would
 be a simple improvement.


I think I got it now, your references are from application objects to
cache keys, so you don't have anything else that you need to modify in
the same transaction.
You only have reads, unless the refcount reaches 0, in which case you
also do a remove.
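That usage pattern amounts to a CAS-style loop over the atomic operations, outside any transaction. A minimal sketch (illustrative names):

    import org.infinispan.Cache;

    class RefCounts {
        // Decrement a reference count atomically, removing the entry when it
        // reaches zero; retries on contention, no transaction involved.
        static boolean releaseRef(Cache<String, Integer> cache, String key) {
            while (true) {
                Integer count = cache.get(key);
                if (count == null)
                    return false; // already removed
                if (count <= 1) {
                    if (cache.remove(key, count)) // last reference: delete entry
                        return true;
                } else if (cache.replace(key, count, count - 1)) {
                    return true; // decremented
                }
                // another thread changed the count concurrently: retry
            }
        }
    }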

It's clear to me now that the distributed AtomicInteger approach would
be best in your case, so that means I haven't found a real-life
problem that requires in-transaction atomic operations yet ;-)

Dan


 Sanne


 Dan

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Atomic operations and transactions

2011-07-05 Thread Dan Berindei
On Tue, Jul 5, 2011 at 1:39 PM, Sanne Grinovero sa...@infinispan.org wrote:
 2011/7/5 Dan Berindei dan.berin...@gmail.com:
 Here is a contrived example:

 1. Start tx Tx1
 2. cache.get(k) -> v0
 3. cache.replace(k, v0, v1)
 4. cache.get(k) -> ??

 With repeatable read and suspend/resume around atomic operations, I
 believe operation 4 would return v0, and that would be very
 surprising for a new user.
 So I'd rather require explicit suspend/resume calls to make sure
 anyone who uses atomic operations in a transaction understands what
 results he's going to get.

 The problem is that as a use case it makes no sense to use an atomic
 operation without evaluating the return value,
 so 3) should actually read like:

 3. boolean done = cache.replace(k, v0, v1)
 and based on this value the application would branch in some way; so
 acquiring locks and waiting for each other is not enough: we can
 only support this if write skew checks are enabled, and mandate the
 full transaction to roll back in the end. That might be one option, but I
 really don't like making it likely for transactions to roll back; I'd
 prefer to have an alternative like a new flag which enforces a fresh
 read, skipping the repeatable read guarantees. Of course this wouldn't
 work if we're not actually sending the operations to the key owners,
 so suspending the transaction is a much nicer approach from the user
 perspective. Though I agree this behaviour should be selectable.


Ok, I'm slowly remembering your arguments... do you think the fresh
read flag should be available for all operations, or does it make
sense to make it an internal flag that only the atomic operations will
use?

To summarize, with this example:
1. Start tx
2. cache.get(k) -> v0
3. cache.replace(k, v0, v1)
4. cache.get(k) -> ??
5. Commit tx

The options we could support are:
a. Tx suspend: no retries, but also no rollback for replace() and 4)
will not see the updated value
b. Optimistic locking + write skew check: if the key is modified by
another tx between 2) and 5), the entire transaction has to be redone
c. Optimistic locking + write skew check + fresh read: we only have to
redo the tx if the key is modified by another tx between 3) and 5)
d. Pessimistic locking: if the key is modified between 2) and 5), the
entire transaction has to be redone
e. Pessimistic locking + fresh read: no redo, but decreased throughput
because we hold the lock between 3) and 5)

I guess there is no reason to support option d), as we're making an
RPC to the owner in order to get the lock anyway. I think I'm leaning
towards supporting only a) and e), but there might be cases where b)
and c) would perform better.

Dan
___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Atomic operations and transactions

2011-07-05 Thread Dan Berindei
On Tue, Jul 5, 2011 at 4:04 PM, Sanne Grinovero sa...@infinispan.org wrote:
 2011/7/5 Dan Berindei dan.berin...@gmail.com:
 On Tue, Jul 5, 2011 at 1:39 PM, Sanne Grinovero sa...@infinispan.org wrote:
 2011/7/5 Dan Berindei dan.berin...@gmail.com:
 Here is a contrived example:

 1. Start tx Tx1
 2. cache.get(k) -> v0
 3. cache.replace(k, v0, v1)
 4. cache.get(k) -> ??

 With repeatable read and suspend/resume around atomic operations, I
 believe operation 4 would return v0, and that would be very
 surprising for a new user.
 So I'd rather require explicit suspend/resume calls to make sure
 anyone who uses atomic operations in a transaction understands what
 results he's going to get.

 The problem is that as a use case it makes no sense to use an atomic
 operation without evaluating the return value,
 so 3) should actually read like:

 3. boolean done = cache.replace(k, v0, v1)
 and based on this value the application would branch in some way; so
 acquiring locks and waiting for each other is not enough: we can
 only support this if write skew checks are enabled, and mandate the
 full transaction to roll back in the end. That might be one option, but I
 really don't like making it likely for transactions to roll back; I'd
 prefer to have an alternative like a new flag which enforces a fresh
 read, skipping the repeatable read guarantees. Of course this wouldn't
 work if we're not actually sending the operations to the key owners,
 so suspending the transaction is a much nicer approach from the user
 perspective. Though I agree this behaviour should be selectable.


 Ok, I'm slowly remembering your arguments... do you think the fresh
 read flag should be available for all operations, or does it make
 sense to make it an internal flag that only the atomic operations will
 use?

 To summarize, with this example:
 1. Start tx
 2. cache.get(k) -> v0
 3. cache.replace(k, v0, v1)
 4. cache.get(k) -> ??
 5. Commit tx

 The options we could support are:
 a. Tx suspend: no retries, but also no rollback for replace() and 4)
 will not see the updated value

 might work, but looks like a crazy user experience.


We could support it by letting the user suspend/resume the tx
manually. Then the only people experiencing this would be the ones who
explicitly requested it.

 b. Optimistic locking + write skew check: if the key is modified by
 another tx between 2) and 5), the entire transaction has to be redone

 might work as well: since people opted in for optimistic locking they
 should be prepared to experience failures.
 I'm not sure what the + stands for, how can you have optimistic
 locking without write skew checks?


I was never sure in what situations we do the write skew check, so I
thought I'd mention it to be clear.
Certainly atomic operations would never work without write skew
checks, but I don't think they're required for regular writes.

 c. Optimistic locking + write skew check + fresh read: we only have to
 redo the tx if the key is modified by another tx between 3) and 5)

 in this case we're breaking the repeatable read guarantees, so we
 should clarify this very well.


Yes, it would definitely need to be documented that atomic operations
always use read_committed if we go this way.

 d. Pessimistic locking: if the key is modified between 2) and 5), the
 entire transaction has to be redone

 I don't understand what's pessimistic about this? To be pessimistic it
 would attempt to guarantee success by locking at 2): during the get
 operation, before returning the value.
 Also if the key is modified implies write skew checks, so how would
 this be different from the previous proposals?
 Generally, as a user, if I'm opting in for a pessimistic lock the only
 exception I'm prepared to handle is a timeout, definitely not a try
 again, the values changed.


Sorry, I meant if the key is modified between 2) and 3), since 3)
would not read the key again we'd have to check the value in the
prepare phase.
But you're right that it doesn't really make sense, since we're going
to the owner to get the lock anyway we might as well check the value
in one go.

 e. Pessimistic locking + fresh read: no redo, but decreased throughput
 because we hold the lock between 3) and 5)

 I assume you really mean to do explicit pessimistic locking:
 1. Start tx
 2. cache.lock(k);
 3. cache.get(k) -> v0
 4. cache.replace(k, v0, v1)  // <- throws an exception if we're
 not owning the lock
 5. cache.get(k) -> ??
 6. Commit tx


No, I meant the replace call would do the locking itself and hold the
lock until the end of the tx, just like a regular put.

This option would be very similar to what we currently have, since any
write command already acquires a lock on the key. The only difference
would be that value check would be done on the main data owner, with
the current value of the key instead of the value in the invocation
context.

One big gotcha is that we will only update the value in the invocation
context if the replace succeeds, so the user will have

Re: [infinispan-dev] Faster LRU

2011-07-05 Thread Dan Berindei
On Tue, Jul 5, 2011 at 7:23 PM, Vladimir Blagojevic vblag...@redhat.com wrote:
 Hey guys,

 In the past few days I've looked around at how to squeeze every bit of
 performance out of BCHM and particularly our LRU impl. What I did not
 like about the current LRU is that a search for an element in the queue is
 not a constant time operation but requires a full queue traversal if we need
 to find an element[1].

 It would be particularly nice to have a hashmap with a constant cost
 for lookup operations. Something like LinkedHashMap. LinkedHashMap
 seems to be a good container for LRU as it provides constant time lookup
 but also a hook for evicting the oldest entry, in the
 form of the removeEldestEntry callback. So why not implement our segment
 eviction policy using a LinkedHashMap [2]?
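A minimal sketch of the idea, assuming a fixed per-segment capacity (class and field names are illustrative, not the actual patch):

    import java.util.LinkedHashMap;
    import java.util.Map;

    // LRU bookkeeping via LinkedHashMap: accessOrder=true moves an entry
    // to the tail on each get(), and removeEldestEntry() evicts the head
    // in constant time.
    class LruSegment<K, V> extends LinkedHashMap<K, V> {
        private final int maxEntries;

        LruSegment(int maxEntries) {
            super(16, 0.75f, true); // accessOrder = true => LRU iteration order
            this.maxEntries = maxEntries;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > maxEntries; // evict the least recently used entry
        }
    }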


+1 Vladimir, I had a similar idea to implement a BCHM segment entirely
using a single LinkedHashMap - obviously that requires a lot more work
to integrate with the existing BCHM than using a LHM inside the
eviction policy, but it should also be more memory efficient.

 I've seen about 50% performance increase for smaller caches (100K) and
 even more for larger and more contended caches - about 75% increase.
 After this change BCHM performance was not that much worse than CHM and
 it was faster than synchronized hashmap.

 Should we include this impl as FAST_LRU as I would not want to remove
 current LRU just yet? We have to prove this one is correct and that it
 does not have any unforeseen issues.


I would definitely remove the old LRU, it's much harder to understand
because of the batching and the JDK already has tests to guarantee
that LinkedHashMap is correct.

One idea for testing I was discussing with Galder on my pull request's
page was to simulate a real cache workload, with get misses triggering
a put and also a small delay, in order to evaluate how good an
eviction policy is at keeping the most used keys in.

We definitely need to do *some* testing for this, we can't afford to wait
for another community member to come in a year's time and prove to us
that we're rubbish :)

Dan
___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Faster LRU

2011-07-05 Thread Dan Berindei
On Tue, Jul 5, 2011 at 7:47 PM, Alex Kluge java_kl...@yahoo.com wrote:

 Hi,

   I have a completely in-line limit on the cache entries built on a clock
 cache that is an approximation to an LRU cache, and is extremely fast
  (O(1)). It is not a strict LRU, but chooses a not-recently-used item for
  removal. I'll provide some more details soon.

   I'm not sure how far along you are, as some of this is in the future
 tense, and some in the past. But I'll dig up this code - it's been a while
 since I worked on the cache. :)
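For readers unfamiliar with clock eviction, a minimal sketch of the general technique (our illustration, not Alex's code):

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // CLOCK ("second chance") eviction: a get() only flips a bit, and the
    // sweeping hand evicts the first entry found without its bit set, so
    // eviction is O(1) amortized while approximating LRU.
    class ClockCache<K, V> {
        private static final class Entry<K, V> {
            final K key; V value; boolean referenced;
            Entry(K key, V value) { this.key = key; this.value = value; }
        }

        private final int capacity;
        private final Map<K, Entry<K, V>> index = new HashMap<>();
        private final List<Entry<K, V>> ring = new ArrayList<>();
        private int hand;

        ClockCache(int capacity) { this.capacity = capacity; }

        V get(K key) {
            Entry<K, V> e = index.get(key);
            if (e == null) return null;
            e.referenced = true; // mark as recently used, no reordering needed
            return e.value;
        }

        void put(K key, V value) {
            Entry<K, V> e = index.get(key);
            if (e != null) { e.value = value; e.referenced = true; return; }
            if (ring.size() >= capacity) evictOne();
            e = new Entry<>(key, value);
            index.put(key, e);
            ring.add(e);
        }

        private void evictOne() {
            // Sweep the hand, granting referenced entries a second chance.
            while (true) {
                Entry<K, V> victim = ring.get(hand);
                if (victim.referenced) {
                    victim.referenced = false;
                    hand = (hand + 1) % ring.size();
                } else {
                    index.remove(victim.key);
                    ring.set(hand, ring.get(ring.size() - 1)); // swap-remove
                    ring.remove(ring.size() - 1);
                    if (hand >= ring.size()) hand = 0;
                    return;
                }
            }
        }
    }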


This sounds great Alex. We're not strictly LRU either because the policy is
enforced at the map segment level, so we don't always evict the oldest
element in the map.

Maybe you also have a good cache/eviction policy test? ;-)

Cheers
Dan
___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev

Re: [infinispan-dev] Faster LRU

2011-07-07 Thread Dan Berindei
On Thu, Jul 7, 2011 at 1:21 PM, Manik Surtani ma...@jboss.org wrote:
 I think we leave the old LRU as LRU_OLD and mark it as deprecated.

I for one am against keeping the old policy around, as the new LRU
policy implements exactly the same algorithm only in O(1) instead of
O(n).
It would have made sense to keep it if there was a difference in the
algorithm, but Vladimir even kept the batching from the old LRU
policy.

We should test it as thoroughly as we can, and to that end I've been
working on some additions to MapStressTest that try to measure how
good the eviction algorithm is. I think it already helped me find a
problem with the new LRU.

I've updated pull #414
(https://github.com/infinispan/infinispan/pull/414) to work on top of
Vladimir's pull request, in case you want to have a look. You might
want to adjust the number of keys and/or disable some of the options
in the data providers before running it though, it takes a lot of time
to run (and it also needs -Xmx2000m).

I've left it running overnight on the test cluster (cluster01 and
cluster10), I'll send an update with the results in the morning.

Cheers
Dan
___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Faster LRU

2011-07-08 Thread Dan Berindei
On Fri, Jul 8, 2011 at 2:53 AM, Dan Berindei dan.berin...@gmail.com wrote:
 I've updated pull #414
 (https://github.com/infinispan/infinispan/pull/414) to work on top of
 Vladimir's pull request, in case you want to have a look. You might
 want to adjust the number of keys and/or disable some of the options
 in the data providers before running it though, it takes a lot of time
 to run (and it also needs -Xmx2000m).

 I've left it running overnight on the test cluster (cluster01 and
 cluster10), I'll send an update with the results in the morning.



Morning update:
Ok, apparently -Xmx2000m wasn't enough for 2 million keys, so I had to
start the tests again in the morning, running each scenario on a
different machine.

I haven't run the tests with concurrency level 512, as the total
number of threads is only 100, but I suspect the old LRU still won't
catch up with the new LRU's performance.

It's interesting that in the writeOnMiss test the new LRU performance
dropped when I increased the concurrency level from 32 to 128. I think
it might be because the eviction.thresholdExpired() check in
BCHM.attemptEviction() is done without a lock and so it could return
true simultaneously for multiple threads - which will all proceed to
wait on the segment lock and attempt eviction at the same time.

Another strange pattern is that neither eviction policy respects the
capacity parameter exactly. LIRS rounds up the capacity to the next
power of 2, and LRU/LRUOld do the same rounding and then multiply by
0.75.

I'll report again once I've fixed these and updated the reporting -
I think the total number of misses might be more relevant than the
standard deviation of the keys at the end.

Cheers
Dan
Testing independent read/write/remove performance with 500000 max elements, 1000000 keys, 32 concurrency level, 90 readers, 9 writers, 1 removers
Container BCHM:LIRS     Average get ops/s 17692.17  Average put ops/s  3871.06  Average remove ops/s  2043.00  Size 515661  stdDev 175722.99
Container BCHM:LRU      Average get ops/s 36023.40  Average put ops/s  1084.62  Average remove ops/s  1131.47  Size 393216  stdDev 145373.51
Container BCHM:LRU_OLD  Average get ops/s   849.86  Average put ops/s   273.88  Average remove ops/s   244.41  Size 398940  stdDev 154739.53
Container CLHM          Average get ops/s 48761.40  Average put ops/s 47008.30  Average remove ops/s 49627.01  Size 500000  stdDev 187244.21
Container SLHM          Average get ops/s 16866.18  Average put ops/s 23715.77  Average remove ops/s 15630.28  Size 500000  stdDev 178134.07
[testng-MapStressTest] Test testReadWriteRemove(org.infinispan.stress.MapStressTest) succeeded.
Test suite progress: tests succeeded: 1, failed: 0, skipped: 0.
Testing independent read/write/remove performance with 500000 max elements, 1000000 keys, 128 concurrency level, 90 readers, 9 writers, 1 removers
Container BCHM:LIRS     Average get ops/s 18868.45  Average put ops/s  6599.76  Average remove ops/s  3316.55  Size 521216  stdDev 175810.97
Container BCHM:LRU      Average get ops/s 32797.94  Average put ops/s  7976.01  Average remove ops/s  8463.70  Size 393113  stdDev 153926.63
Container BCHM:LRU_OLD  Average get ops/s  4093.76  Average put ops/s  1675.52  Average remove ops/s  1335.27  Size 389083  stdDev 152948.92
Container CLHM          Average get ops/s 68179.61  Average put ops/s 64678.23  Average remove ops/s 70212.72  Size 499968  stdDev 181346.12
Container SLHM          Average get ops/s 17893.75  Average put ops/s 24506.01  Average remove ops/s 14617.09  Size 491341  stdDev 181870.67
[testng-MapStressTest] Test testReadWriteRemove(org.infinispan.stress.MapStressTest) succeeded.
Test suite progress: tests succeeded: 2, failed: 0, skipped: 0.


Testing combined read/write performance with 500000 max elements, 1000000 keys, 32 concurrency level, 100 threads, 9:1 read:write ratio
Container BCHM:LIRS     Average ops/s 12903.45  Size 524258  stdDev 170614.40
Container BCHM:LRU      Average ops/s 25041.20  Size 393216  stdDev 154386.88
Container BCHM:LRU_OLD  Average ops/s   545.07  Size 393243  stdDev 156098.32
Container CLHM          Average ops/s 45277.37  Size 500000  stdDev 169786.71
Container SLHM          Average ops/s 17522.45  Size 500000  stdDev 171034.64
[testng-MapStressTest] Test testReadWrite

Re: [infinispan-dev] Faster LRU

2011-07-11 Thread Dan Berindei
On Fri, Jul 8, 2011 at 12:44 PM, Dan Berindei dan.berin...@gmail.com wrote:

 It's interesting that in the writeOnMiss test the new LRU performance
 dropped when I increased the concurrency level from 32 to 128. I think
 it might be because the eviction.thresholdExpired() check in
 BCHM.attemptEviction() is done without a lock and so it could return
 true simultaneously for multiple threads - which will all proceed to
 wait on the segment lock and attempt eviction at the same time.


I added another batch threshold check while holding the lock and the
performance anomaly disappeared.
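In sketch form, the double-check looks like this (BCHM internals are paraphrased with hypothetical names):

    import java.util.concurrent.locks.ReentrantLock;

    interface EvictionPolicy {
        boolean thresholdExpired();
        void execute();
    }

    class Segment {
        final ReentrantLock lock = new ReentrantLock();
        final EvictionPolicy eviction;

        Segment(EvictionPolicy eviction) { this.eviction = eviction; }

        // The cheap unlocked pre-check lets most threads skip eviction;
        // re-checking under the lock ensures the batched eviction runs
        // only once per threshold crossing instead of once per waiter.
        void attemptEviction() {
            if (eviction.thresholdExpired()) { // racy pre-check
                lock.lock();
                try {
                    if (eviction.thresholdExpired()) // re-check under lock
                        eviction.execute(); // batch eviction, executed once
                } finally {
                    lock.unlock();
                }
            }
        }
    }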

 Another strange pattern is that neither eviction policy respects the
 capacity parameter exactly. LIRS rounds up the capacity to the next
 power of 2, and LRU/LRUOld do the same rounding and then multiply by
 0.75.


I updated BCHM to pass the initial capacity to the eviction policy,
and now all the policies keep the same number of entries.

 I'll report again once I fixed these and once I update the reporting -
 I think the total number of misses might be more relevant than the
 standard deviation of the keys at the end.


I got the tests running on the test cluster again and the new LRU is
clearly better in every respect than the old LRU. In fact the only
problem in the test results is that in the new writeOnMiss
scenario the hit ratio of the new LRU is much better than all the
other policies. There is probably a mistake somewhere in the test, if
it was a random thing I don't think it would have been so visible in a
20-minute test:

MapStressTest configuration: capacity 500000, test running time 1200 seconds
Container BCHM:LIRS     Ops/s 11155.49  HitRatio 96.54  Size 499968  stdDev 193558.30
Container BCHM:LRU      Ops/s 31292.06  HitRatio 97.84  Size 500000  stdDev 193168.07
Container BCHM:LRU_OLD  Ops/s   116.89  HitRatio 76.11  Size 500032  stdDev 197974.87

Testing write on miss performance with capacity 500000, keys 2000000, concurrency level 32, threads 100
Container BCHM:LIRS     Ops/s 1684.01  HitRatio 63.13  Size 499968  stdDev 338637.40
Container BCHM:LRU      Ops/s 4884.57  HitRatio 84.47  Size 500000  stdDev 353336.31
Container BCHM:LRU_OLD  Ops/s   50.69  HitRatio 41.34  Size 500032  stdDev 361239.68
MapStressTest configuration: capacity 500000, test running time 1200 seconds
Testing independent read/write/remove performance with capacity 500000, keys 1000000, concurrency level 32, readers 90, writers 9, removers 1
Container BCHM:LIRS     Ops/s 14865.45  Gets/s 13971.36  Puts/s  8424.86  Removes/s  4640.39  HitRatio 83.10  Size 499972  StdDev 168386.87
Container BCHM:LRU      Ops/s 35970.38  Gets/s 33800.61  Puts/s 19273.89  Removes/s 21815.97  HitRatio 78.64  Size 500000  StdDev 192598.94
Container BCHM:LRU_OLD  Ops/s   496.10  Gets/s   467.32  Puts/s   258.91  Removes/s   269.63  HitRatio 69.66  Size 500032  StdDev 182468.36
Container CLHM          Ops/s 56074.13  Gets/s 50656.76  Puts/s 48274.11  Removes/s 53090.65  HitRatio 77.62  Size 500000  StdDev 180395.29
Container SLHM          Ops/s 20603.49  Gets/s 18110.83  Puts/s 22696.60  Removes/s 20080.24  HitRatio 80.12  Size 499401  StdDev 179188.12
[testng-MapStressTest] Test testReadWriteRemove(org.infinispan.stress.MapStressTest) succeeded.
Test suite progress: tests succeeded: 1, failed: 0, skipped: 0.
Testing independent read/write/remove performance with capacity 500000, keys 1000000, concurrency level 128, readers 90, writers 9, removers 1
Container BCHM:LIRS     Ops/s 23567.68  Gets/s 21558.38  Puts/s 19691.52  Removes/s  3617.48  HitRatio 84.73  Size 498450  StdDev 167244.66
Container BCHM:LRU      Ops/s 38313.89  Gets/s 35514.87  Puts/s 24921.89  Removes/s 27617.36  HitRatio 77.67  Size 500096  StdDev 193880.45
Container BCHM:LRU_OLD  Ops/s  2717.51  Gets/s  2600.66  Puts/s  1035.20  Removes/s  1197.20  HitRatio 79.14  Size 499573  StdDev 180107.36
Container CLHM          Ops/s 77609.22  Gets/s 70243.55  Puts/s 65605.74  Removes/s 72464.47  HitRatio 78.64  Size 499968  StdDev 182996.20
Container SLHM          Ops/s 18812.45  Gets/s 16125.15  Puts/s 25319.02  Removes/s 14003.08  HitRatio 81.44  Size 496624  StdDev 183041.62
[testng-MapStressTest] Test testReadWriteRemove(org.infinispan.stress.MapStressTest) succeeded.
Test suite progress: tests succeeded: 2, failed: 0, skipped: 0.
Testing independent read/write/remove performance with capacity 500000, keys 2000000, concurrency level 32, readers 90, writers 9, removers 1
Container BCHM:LIRS     Ops/s 10579.80  Gets/s 10346.59  Puts/s  2109.26  Removes/s  2002.00  HitRatio 50.88  Size 499969  StdDev 245820.97
Container BCHM:LRU      Ops/s 35180.30  Gets/s 34372.36  Puts/s  6871.62  Removes/s 10867.85  HitRatio 50.08

Re: [infinispan-dev] Faster LRU

2011-07-11 Thread Dan Berindei
On Mon, Jul 11, 2011 at 7:29 PM, Vladimir Blagojevic
vblag...@redhat.com wrote:
 On 11-07-08 5:44 AM, Dan Berindei wrote:

 I haven't run the tests with concurrency level 512, as the total
 number of threads is only 100, but I suspect the old LRU still won't
 catch up with the new LRU's performance.

 It's interesting that in the writeOnMiss test the new LRU performance
 dropped when I increased the concurrency level from 32 to 128. I think
 it might be because the eviction.thresholdExpired() check in
 BCHM.attemptEviction() is done without a lock and so it could return
 true simultaneously for multiple threads - which will all proceed to
 wait on the segment lock and attempt eviction at the same time.


 I am not sure about this Dan. I looked at this code for hours! I do not see
 how two threads can call eviction#execute() concurrently.


Sorry I wasn't very clear, two threads can enter attemptEviction
simultaneously, one will get the lock and perform eviction, the other
will also try to get the lock and when it gets it it will proceed to
call eviction#execute() again.
So eviction#execute() is not called concurrently, but it is called
twice when it should have been called only once, and I think this
dilutes the advantages of batching.
___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] IllegalStateException when receiving invalidation messages on a stopped/stopping cache

2011-07-14 Thread Dan Berindei
TBH I'm not sure why the patch helped with the error messages in the
log, I always thought that doing this would just replace the
IllegalStateException with an InterruptedException (since we do some more
stuff after invalidation).

Of course the best solution would be to not do rehashing at all when
we know we're going to stop the entire cluster anyway
(https://issues.jboss.org/browse/ISPN-1239).

Dan


On Thu, Jul 14, 2011 at 7:48 PM, Manik Surtani ma...@jboss.org wrote:
 Patch looks good.  Go ahead and issue a pull req.

 On 14 Jul 2011, at 16:26, Sanne Grinovero wrote:

 Hello,
 I was looking into the many stacktraces being thrown from the
 testsuite, this one caught my attention (see below).
 Can't we just discard such errors? If the cache is stopping, or
 stopped already, we shouldn't really care about invalidations -
 especially stacktraces and exceptions being returned to the caller.
 This doesn't solve all the EOF exceptions I'm still experiencing, but
 it seems to make things a bit better. I've hacked a solution which
 involves adding a method:

 boolean ignoreCommandOnStatus(ComponentStatus status);

 to the VisitableCommand interface, seems to work fine. Shall I open a
 JIRA and send a pull request?

 Details:
 https://github.com/Sanne/infinispan/commit/ed962ed72bc68765078b6a0f172b95ea1c07d485#L7L142

 Cheers,
 Sanne

 2011-07-14 15:16:20,940 ERROR [RebalanceTask]
 (Rehasher,Infinispan-Cluster,NonStringKeyStateTransferTest-NodeC-7649)
 ISPN000145: Error during rehash
 java.lang.IllegalStateException: Default cache is in 'STOPPING' state
 and this is an invocation not belonging to an on-going transaction, so
 it does not accept new invocations. Either restart it or recreate the
 cache container.
       at 
 org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:83)
       at 
 org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:64)
       at 
 org.infinispan.commands.AbstractVisitor.visitInvalidateCommand(AbstractVisitor.java:120)
       at 
 org.infinispan.commands.AbstractVisitor.visitInvalidateL1Command(AbstractVisitor.java:124)
       at 
 org.infinispan.commands.write.InvalidateL1Command.acceptVisitor(InvalidateL1Command.java:177)
       at 
 org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:274)
       at 
 org.infinispan.distribution.RebalanceTask.invalidateKeys(RebalanceTask.java:172)
       at 
 org.infinispan.distribution.RebalanceTask.performRehash(RebalanceTask.java:145)
       at org.infinispan.distribution.RehashTask.call(RehashTask.java:67)
       at org.infinispan.distribution.RehashTask.call(RehashTask.java:44)
       at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
       at java.util.concurrent.FutureTask.run(FutureTask.java:138)
       at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
       at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
       at java.lang.Thread.run(Thread.java:662)
 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

 --
 Manik Surtani
 ma...@jboss.org
 twitter.com/maniksurtani

 Lead, Infinispan
 http://www.infinispan.org




 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] IllegalStateException when receiving invalidation messages on a stopped/stopping cache

2011-07-15 Thread Dan Berindei
I was surprised because I would have thought your change would replace
each IllegalStateException in the log with an InterruptedException, so
I wouldn't have expected the log to be any less noisy.
I mean you don't stop the rehashing during invalidation, you just skip
invalidation and go to the next phase
(notifyCoordinatorPushCompleted), which I thought should throw an
InterruptedException.

True, we could ignore InterruptedExceptions in RehashTask and not log
anything, but based only on this change I was expecting the same
number of exceptions in the log. Good to know that I was wrong :)

Dan


On Fri, Jul 15, 2011 at 1:46 PM, Sanne Grinovero sa...@infinispan.org wrote:
 Sorry for not being clear - I'm not assuming that this has anything to
 do with the InterruptedException/EOFs being still logged,
 just that the output is less noisy if I remove these, which were pointless.

 Sanne

 2011/7/14 Dan Berindei dan.berin...@gmail.com:
 TBH I'm not sure why the patch helped with the error messages in the
 log, I always thought that doing this would just replace the
 IllegalStateException with an InterruptedException (since we do some more
 stuff after invalidation).

 Of course the best solution would be to not do rehashing at all when
 we know we're going to stop the entire cluster anyway
 (https://issues.jboss.org/browse/ISPN-1239).

 Dan


 On Thu, Jul 14, 2011 at 7:48 PM, Manik Surtani ma...@jboss.org wrote:
 Patch looks good.  Go ahead and issue a pull req.

 On 14 Jul 2011, at 16:26, Sanne Grinovero wrote:

 Hello,
 I was looking into the many stacktraces being thrown from the
 testsuite, this one caught my attention (see below).
 Can't we just discard such errors? If the cache is stopping, or
 stopped already, we shouldn't really care about invalidations -
 especially stacktraces and exceptions being returned to the caller.
 This doesn't solve all the EOF exceptions I'm still experiencing, but
 it seems to make things a bit better. I've hacked a solution which
 involves adding a method:

 boolean ignoreCommandOnStatus(ComponentStatus status);

 to the VisitableCommand interface, seems to work fine. Shall I open a
 JIRA and send a pull request?

 Details:
 https://github.com/Sanne/infinispan/commit/ed962ed72bc68765078b6a0f172b95ea1c07d485#L7L142

 Cheers,
 Sanne

 2011-07-14 15:16:20,940 ERROR [RebalanceTask]
 (Rehasher,Infinispan-Cluster,NonStringKeyStateTransferTest-NodeC-7649)
 ISPN000145: Error during rehash
 java.lang.IllegalStateException: Default cache is in 'STOPPING' state
 and this is an invocation not belonging to an on-going transaction, so
 it does not accept new invocations. Either restart it or recreate the
 cache container.
       at 
 org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:83)
       at 
 org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:64)
       at 
 org.infinispan.commands.AbstractVisitor.visitInvalidateCommand(AbstractVisitor.java:120)
       at 
 org.infinispan.commands.AbstractVisitor.visitInvalidateL1Command(AbstractVisitor.java:124)
       at 
 org.infinispan.commands.write.InvalidateL1Command.acceptVisitor(InvalidateL1Command.java:177)
       at 
 org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:274)
       at 
 org.infinispan.distribution.RebalanceTask.invalidateKeys(RebalanceTask.java:172)
       at 
 org.infinispan.distribution.RebalanceTask.performRehash(RebalanceTask.java:145)
       at org.infinispan.distribution.RehashTask.call(RehashTask.java:67)
       at org.infinispan.distribution.RehashTask.call(RehashTask.java:44)
       at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
       at java.util.concurrent.FutureTask.run(FutureTask.java:138)
       at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
       at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
       at java.lang.Thread.run(Thread.java:662)
 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

 --
 Manik Surtani
 ma...@jboss.org
 twitter.com/maniksurtani

 Lead, Infinispan
 http://www.infinispan.org




 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev



___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


[infinispan-dev] State transfer is transferring the same data twice

2011-07-15 Thread Dan Berindei
Hi guys

While fixing https://issues.jboss.org/browse/ISPN-1243 I found that in
certain cases state transfer will copy the same data twice: first
when copying the in-memory state and then again while copying the
persistent state.

I was getting duplicate key exceptions in
StateTransferManagerImpl.applyPersistentState(), so I changed
StateTransferManagerImpl.applyInMemoryState() to skip writing to the
cache store if persistent state transfer is enabled and passivation is
disabled. That seems to work, but I'm wondering if there are other
cases that I missed.

For 5.1, since we're going to use the rehashing code, we can check if
a key is in memory and not include it again in the persistent state,
but that seems too intrusive to include in 5.0. WDYT?

Cheers
Dan
___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] BeanUtils.getterMethod

2011-08-04 Thread Dan Berindei
On Thu, Aug 4, 2011 at 2:03 AM, Sanne Grinovero sa...@infinispan.org wrote:
 2011/8/3 Dan Berindei dan.berin...@gmail.com:
 I think it's supposed to be an automatic version of
 https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/config/GlobalConfiguration.java#L326
 I prefer the manual version in GlobalConfiguration, although I don't
 think there are any users for those components either.

 Dan

 I was thinking something similar, but would like to know if it's still
 used - somebody must know about this code?
 Initially I thought it was some clever way to extract instances from a
 user-customizable configuration instance, but since it's reflecting on
 the static definition of Configuration it doesn't look like it's meant
 to support extensions of it.

 If it's used we could cache the references to existing methods and
 track also which methods don't exist, so we limit the number of
 exceptions to a single one per attempted method name (i.e. starting
 500 caches would cost the same as starting one).
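A sketch of that caching (a hypothetical helper, not existing Infinispan code): both hits and misses are remembered, so an absent getter costs at most one NoSuchMethodException per class/property pair.

    import java.lang.reflect.Method;
    import java.util.Optional;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    class GetterCache {
        private final ConcurrentMap<String, Optional<Method>> cache =
                new ConcurrentHashMap<>();

        Method getterMethod(Class<?> clazz, String property) {
            String key = clazz.getName() + "#" + property;
            return cache.computeIfAbsent(key, k -> {
                String name = "get" + Character.toUpperCase(property.charAt(0))
                        + property.substring(1);
                try {
                    return Optional.of(clazz.getMethod(name));
                } catch (NoSuchMethodException e) {
                    return Optional.empty(); // cache the miss too
                }
            }).orElse(null);
        }
    }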

Even a custom Configuration could register its objects in the
component registry manually in an @Inject method, like
GlobalConfiguration does. But I don't see any good reason for
injecting bits of the configuration instead of injecting the whole
thing and navigating the configuration tree in the component code.

So I'd go with your initial suggestion and remove it.

Dan

 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


[infinispan-dev] Starting caches in parallel

2011-08-04 Thread Dan Berindei
Hi guys

I've found a deadlock with transactions spanning multiple caches
during rehashing if the joiner's caches are started sequentially (for
more details see https://gist.github.com/1124740)
After discussing a bit on IRC with Manik and Galderz it appears the
only solution for 5.0.0.FINAL would be to have a mechanism to start
all the caches in parallel.

There are several options to implement that. All of them require the
users to know that they should create the caches in parallel at
application startup to avoid problems.

1. Advise the users about the problem, but let them create their own
threads and call EmbeddedCacheManager.getCache() on those threads.


2. Add a method EmbeddedCacheManager.getCaches() to start multiple
caches in parallel.

This is not as straightforward as it may seem, first there is a
question of whether to use template parameters or not:
2.a. Set<Cache> getCaches(String... cacheNames);
vs
2.b. Set<Cache<K, V>> getCaches(String... cacheNames);

I don't think having the same K and V for all the caches is going to
be very common, so I'd go with 2.a.

Then there is the problem of how to request the default cache. I think
"" should be fine, but we'd need to document it and also change the
regular getCache() methods to accept it.


3. Add an async version of EmbeddedCacheManager.getCache():

Future<Cache<K, V>> getCacheAsync(String cacheName);
Future<Cache<K, V>> getCacheAsync();

This nicely sidesteps the generics issue in solution 2 and it's also
easier for the users than solution 1, so it's currently my favourite.


What do you think?


Dan
___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Starting caches in parallel

2011-08-04 Thread Dan Berindei
On Thu, Aug 4, 2011 at 2:01 PM, Sanne Grinovero sa...@infinispan.org wrote:
 2011/8/4 Dan Berindei dan.berin...@gmail.com:
 Hi guys

 I've found a deadlock with transactions spanning multiple caches
 during rehashing if the joiner's caches are started sequentially (for
 more details see https://gist.github.com/1124740)
 After discussing a bit on IRC with Manik and Galderz it appears the
 only solution for 5.0.0.FINAL would be to have a mechanism to start
 all the caches in parallel.

 There are several options to implement that. All of them require the
 users to know that they should create the caches in parallel at
 application startup to avoid problems.

 1. Advise the users about the problem, but let them create their own
 threads and call EmbeddedCacheManager.getCache() on those threads.


 2. Add a method EmbeddedCacheManager.getCaches() to start multiple
 caches in parallel.

 This is not as straightforward as it may seem, first there is a
 question of whether to use template parameters or not:
 2.a. Set<Cache> getCaches(String... cacheNames);
 vs
 2.b. Set<Cache<K, V>> getCaches(String... cacheNames);


It turns out the proper generic version should be Map<String, Cache<?
extends Object, ? extends Object>> createCaches(String...), but even
this version produces unchecked warnings.
We'd also had to make the return type a map, otherwise the user has no
way of getting a specific cache.

 I don't think having the same K and V for all the caches is going to
 be very common, so I'd go with 2.a.

 Then there is the problem of how to request the default cache. I think
 "" should be fine, but we'd need to document it and also change the
 regular getCache() methods to accept it.

 You could use org.infinispan.manager.CacheContainer.DEFAULT_CACHE_NAME.


I thought it was something hidden from the users, I wasn't aware it's
in the public API. This is perfect.

 3. Add an async version of EmbeddedCacheManager.getCache():

 Future<Cache<K, V>> getCacheAsync(String cacheName);
 Future<Cache<K, V>> getCacheAsync();

 This nicely sidesteps the generics issue in solution 2 and it's also
 easier for the users than solution 1, so it's currently my favourite.


 What do you think?

 I think it's very bad but it doesn't surprise me, since starting more
 than a single cache in sequence brings you into the scenario of
 ISPN-658,
 unless you're quick enough. So yes, it's not nice, but we highlighted
 the problem long ago, I think?


The workaround for ISPN-658 is to start all caches on application
startup. This scenario appeared specifically because I was trying to
avoid an asymmetric cluster and moved the creation of all caches to a
central location.

 Are you proposing a temporary API to make things work before ISPN-658
 is solved? I don't like the Future approach, it's still unclear that I
 have to send all requests before blocking on any get.

The fix for ISPN-658 will most likely prevent existing nodes from
sending requests to a joiner until that joiner has finished the initial
rehash, so I presume it will make my scenario impossible.
There is another JIRA planned for 5.1 that is targeted more
specifically as a workaround for ISPN-658: allow
InboundInvocationHandler to create caches automatically when it has an
incoming command for that cache (can't find the id now).

 I'd make a

 void startCaches(String... names);

 which implicitly includes the default cache too, and you throw an
 exception if getCache() is used on an unstarted cache, and
 unfortunately you should also throw an exception if startCaches() is
 invoked more than once.


I've had a long chat on IRC about this with Sanne and Pete, and our
conclusion was that adding a startCaches method is the clearest option
for the users.
The intention of this API is only to start the caches, so the name
should clearly state that. It will also eliminate an uncertainty for
users: at the end of startCaches call they will know that the caches
have started and that further getCache calls won't fail.
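A sketch of how that would look from application code (the signature is tentative, per this discussion):

    import org.infinispan.Cache;
    import org.infinispan.manager.CacheContainer;
    import org.infinispan.manager.DefaultCacheManager;
    import org.infinispan.manager.EmbeddedCacheManager;

    public class ParallelStartExample {
        public static void main(String[] args) throws Exception {
            EmbeddedCacheManager cm = new DefaultCacheManager("infinispan.xml");

            // Start all caches in parallel up front, so joiners don't deadlock
            // on sequentially started caches during rehashing.
            cm.startCaches("users", "orders", CacheContainer.DEFAULT_CACHE_NAME);

            // By now the caches are running: getCache() won't trigger a start.
            Cache<String, Object> users = cm.getCache("users");
        }
    }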

We can't afford to break getCache() for users though (including
ourselves), so getCache() will still start the cache. It should still
print a warning when doing so, directing people to call startCaches
instead.
I don't think we should throw an exception if startCaches is invoked
more than once either to match getCache() and all the other
components.

Cheers
Dan
___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Why DistTxInterceptor in use with Hot Rod ?

2011-09-14 Thread Dan Berindei
Going back to your original question Galder, the exception is most
likely thrown because of this sequence of events:

0. Given a cluster {A, B}, a key k and a node C joining.
1. Put acquires the transaction lock on node A (blocking rehashing)
2. Put acquires lock for key k on node A
3. Rehashing starts on node B, blocking transactions
4. Put tries to acquire transaction lock on node B

Since it's impossible to finish rehashing while the put operation
keeps the transaction lock on node A, the best option was to kill the
put operation by throwing a RehashInProgressException.

I was thinking in the context of transactions when I wrote this code
(see 
https://github.com/danberindei/infinispan/commit/6ed94d3b2e184d4a48d4e781db8d404baf5915a3,
this scenario became just a footnote to the generic case with multiple
caches), but the scenario also occurs without transactions. Actually I
renamed it state transfer lock and I moved it to a separate
interceptor in my ISPN-1194 branch.

Maybe the locking changes in 5.1 will eliminate this scenario, but
otherwise we could improve the user experience by retrying the command
after the rehashing finishes.
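A sketch of such a retry, done client-side for now (the constants and helper are illustrative):

    import org.infinispan.Cache;
    import org.infinispan.distribution.RehashInProgressException;

    class RehashRetry {
        static final int MAX_RETRIES = 3;
        static final long BACKOFF_MS = 500;

        // Retry a put that was killed by an in-progress rehash, backing off
        // to give the joiner time to finish integrating the state.
        static <K, V> V putWithRetry(Cache<K, V> cache, K key, V value)
                throws InterruptedException {
            for (int attempt = 0; ; attempt++) {
                try {
                    return cache.put(key, value);
                } catch (RehashInProgressException e) {
                    if (attempt >= MAX_RETRIES)
                        throw e;
                    Thread.sleep(BACKOFF_MS); // wait for the rehash to settle
                }
            }
        }
    }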

Dan


On Tue, Sep 13, 2011 at 8:11 PM, Galder Zamarreño gal...@redhat.com wrote:

 On Sep 13, 2011, at 6:38 PM, Mircea Markus wrote:


 On 13 Sep 2011, at 17:22, Galder Zamarreño wrote:

 Hi,

 I'm looking at this failure http://goo.gl/NQw4h and I'm wondering why 
 org.infinispan.distribution.RehashInProgressException: Timed out waiting 
 for the transaction lock is thrown?

 This is thrown DistTxInterceptor which is added by the 
 InterceptorChainFactory:

     if (configuration.getCacheMode().isDistributed())
        
 interceptorChain.appendInterceptor(createInterceptor(DistTxInterceptor.class));
     else
        
 interceptorChain.appendInterceptor(createInterceptor(TxInterceptor.class));

 However, this is a Hot Rod test and the cache is not configured with a 
 transaction manager, so is DistTxInterceptor really needed?

 In fact, why a TxInterceptor if no transaction manager is configured? (this 
 kinda goes back to Mircea's email earlier today about a transactions 
 enabled/disabled flag)
 All valid concerns, IMO.
 5.1 will clarify this by not adding Tx interceptors for non tx caches.

 Is this captured in some other JIRA? I'd like to reference the test and the 
 JIRA from ISPN-1123 (stabilising testsuite jira)



 Cheers,
 --
 Galder Zamarreño
 Sr. Software Engineer
 Infinispan, JBoss Cache


 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

 --
 Galder Zamarreño
 Sr. Software Engineer
 Infinispan, JBoss Cache




___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Why DistTxInterceptor in use with Hot Rod ?

2011-09-15 Thread Dan Berindei
On Thu, Sep 15, 2011 at 12:25 PM, Galder Zamarreño gal...@redhat.com wrote:

 On Sep 14, 2011, at 11:40 AM, Dan Berindei wrote:

 Going back to your original question Galder, the exception is most
 likely thrown because of this sequence of events:

 0. Given a cluster {A, B}, a key k and a node C joining.
 1. Put acquires the transaction lock on node A (blocking rehashing)
 2. Put acquires lock for key k on node A
 3. Rehashing starts on node B, blocking transactions
 4. Put tries to acquire transaction lock on node B

 Since it's impossible to finish rehashing while the put operation
 keeps the transaction lock on node A, the best option was to kill the
 put operation by throwing a RehashInProgressException.

 I was thinking in the context of transactions when I wrote this code
 (see 
 https://github.com/danberindei/infinispan/commit/6ed94d3b2e184d4a48d4e781db8d404baf5915a3,
 this scenario became just a footnote to the generic case with multiple
 caches), but the scenario also occurs without transactions. Actually I
 renamed it state transfer lock and I moved it to a separate
 interceptor in my ISPN-1194 branch.

 Right, but it shouldn't happen when transactions are off, right?


It will still happen with transactions off, because state transfer
will still acquire the state transfer lock in exclusive mode; the
difference is that with transactions off the WriteCommands acquire the
state transfer lock in shared mode, instead of the PrepareCommands as
with transactions on.

If we don't acquire the state transfer lock then we run the risk of
pushing stale data to other nodes. Are you saying that without
transactions enabled this risk is acceptable?


 Maybe the locking changes in 5.1 will eliminate this scenario, but
 otherwise we could improve the user experience by retrying the command
 after the rehashing finishes.

 I'd prefer that, otherwise users are likely to code similar logic which is 
 not nice.


After looking at the scenario again it became clear that step 2. is
not necessary at all and the locking enhancements will not change
anything.

We could change DistributionInterceptor to not hold the state transfer
lock during remote calls, which will allow rehashing to proceed on the
origin. But then the ownership of the keys might change during the
call, so we'll need a retry phase anyway.

Dan

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Versioned entries - overview of design, looking for comments

2011-09-15 Thread Dan Berindei
On Wed, Sep 14, 2011 at 7:48 PM, Sanne Grinovero sa...@infinispan.org wrote:
 Wouldn't the node performing the operation always do an RPC anyway iff
 the intended operation is to replace a specific value?


I was thinking along the same lines: if there is no value in the
InvocationContext (L0 cache?) then there's no point in doing any write
skew check.

 Examples:

  - If I do a put() operation which doesn't skip the return value, the
 RPC has to be performed, and we get the current version value, which is
 what we will check to be unchanged when committing the write operation.
 This includes putIfAbsent, and all other atomic operations, which
 are of common use and all need an RPC anyway. This means that L1
 caches will contain the version number as well.

  - If I do a put() operation in which I skip the return value, or a
 remove() without any check (nor lock), it seems the user intention is
 to overwrite/delete whatever was there without any further check at
 commit time. In fact this doesn't acquire the optimistick lock.


I see a problem with this reasoning, as either with Jokre or with the
forthcoming void put() operation in JSR-107 the user will not make a
conscious decision to skip the return value.

But I still don't think the write skew check is useful if the value
was not previously read by the user in that transaction.

 - any read operation will need an RPC anyway. (If relevant: I guess
 for this case we could have different options on whether to roll back
 or proceed when detecting stale reads at commit time.)


I thought the write skew check only applied to write operations. But
it would be nice to force a write skew check on a key without a write
(like an optimistic equivalent of lock()).

 - any lock() operation will force such an RPC.

 I don't think I understood the difference between #1 - #2. In both
 cases the writing node needs to retrieve the version information, and
 the key owner will perform the version increment at commit time, if
 the skew check is happy.


The way I understood it, #1 forces the remote get even if the user
didn't request it, but as you said it could just skip the write skew
check in that case.
#2 looks to me like it doesn't fetch the remote version at all, but
I'm having trouble understanding how the owner will perform the write
skew check.

Dan


 Sanne


 On 14 September 2011 16:03, Manik Surtani ma...@jboss.org wrote:
 So I've been hacking on versioned entries for a bit now, and want to run the 
 designs by everyone. Adding an EntryVersion to each entry is easy; making 
 this optional and null by default is easy too, as is a SimpleVersion wrapper 
 around a long and a PartitionTolerantVersion vector clock 
 implementation.  Also easy stuff: changing the entry hierarchy and the 
 marshalling to ensure versions - if available - are shipped, etc.

 Comparing versions would happen in Mircea's optimistic locking code, on 
 prepare, when a write skew check is done.  If running in a non-clustered 
 environment, the simple object-identity check we currently have is enough; 
 otherwise an EntryVersion.compare() will need to happen, with one of 4 
 possible results: equal, newer than, older than, or concurrently modified.  
 The last one can only happen if you have a PartitionTolerantVersion, and 
 will indicate a split brain and simultaneous update.
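A minimal sketch of that four-way comparison for a vector-clock version (our illustration, not the actual PartitionTolerantVersion code):

    class VectorClocks {
        enum Ord { EQUAL, NEWER, OLDER, CONCURRENT }

        // Compare two clocks of equal length, one counter slot per node.
        // CONCURRENT is the split-brain case: each side has seen updates
        // the other has not.
        static Ord compare(long[] mine, long[] theirs) {
            boolean ahead = false, behind = false;
            for (int i = 0; i < mine.length; i++) {
                if (mine[i] > theirs[i]) ahead = true;
                if (mine[i] < theirs[i]) behind = true;
            }
            if (ahead && behind) return Ord.CONCURRENT;
            if (ahead) return Ord.NEWER;
            if (behind) return Ord.OLDER;
            return Ord.EQUAL;
        }
    }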

 Now the hard part.  Who increments the version?  We have a few options, all 
 seem expensive.

 1) The modifying node.  If the modifying node is a data owner, then easy.  
 Otherwise the modifying node *has* to do a remote GET first (or at least a 
 GET_VERSION) before doing a PUT.  Extra RPC per entry.  Sucks.

 2) The data owner.  This would have to happen on the primary data owner 
 only, and the primary data owner would need to perform the write skew check. 
  NOT the modifying node.  The modifying node would also need to increment 
 and ship its own NodeClock along with the modification. Extra info to ship 
 per commit.

 I'm guessing we go with #2, but would like to hear your thoughts.

 Cheers
 Manik

 --
 Manik Surtani
 ma...@jboss.org
 twitter.com/maniksurtani

 Lead, Infinispan
 http://www.infinispan.org




 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] TEST_PING on a Mac?

2011-09-27 Thread Dan Berindei
I've seen the same thing on Linux but a lot less often, about 4 times
in the whole test suite with DEBUG logging. I didn't see it at all
when I switched TRACE logging on.

Dan


On Mon, Sep 26, 2011 at 7:22 PM, Manik Surtani ma...@jboss.org wrote:
 Guys - has anyone seen this?  Happens on many tests when I run on OSX, but 
 works fine on Linux.

 ---
  T E S T S
 ---
 Running TestSuite
 Transaction manager used: JBossTM
 Transport protocol stack used = tcp
 2011-09-26 18:19:25,724 WARN  [GMS] (testng-TxAndInvalidationTimeoutTest) 
 join(TxAndInvalidationTimeoutTest-NodeB-47469) sent to 
 TxAndInvalidationTimeoutTest-NodeA-11323 timed out (after 7000 ms), retrying
  [the same GMS WARN repeats every ~7.5 seconds, fourteen more times, until the test gives up]



 --
 Manik Surtani
 ma...@jboss.org
 twitter.com/maniksurtani

 Lead, Infinispan
 http://www.infinispan.org




 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] ISPN-1384 - InboundInvocationHandlerImpl should wait for cache to be started? (not just defined)

2011-09-27 Thread Dan Berindei
On Tue, Sep 27, 2011 at 5:59 PM, Galder Zamarreño gal...@redhat.com wrote:
 Hi,

 Re: https://issues.jboss.org/browse/ISPN-1384

 I've had a look at this, and this race condition could, in theory, be resolved 
 by making InboundInvocationHandlerImpl.handle() wait for the cache not only to 
 be defined, but to be started. Otherwise there'll always be a potential race 
 condition like the one shown in the log.


Actually I think it would be enough to wait until the cache has
started joining (StateTransferManager.waitForJoinToStart()), that
means all the other components have finished starting.
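
A minimal sketch of the waiting idea, assuming a latch per cache that is only
opened once the cache has fully started (all names here are placeholders, not
the real InboundInvocationHandlerImpl):

  import java.util.concurrent.*;

  // Placeholder names only -- not the real InboundInvocationHandlerImpl.
  final class InboundHandlerSketch {
     private final ConcurrentMap<String, CountDownLatch> started =
           new ConcurrentHashMap<String, CountDownLatch>();

     private CountDownLatch latchFor(String cacheName) {
        CountDownLatch latch = started.get(cacheName);
        if (latch == null) {
           CountDownLatch fresh = new CountDownLatch(1);
           latch = started.putIfAbsent(cacheName, fresh);
           if (latch == null) latch = fresh;
        }
        return latch;
     }

     // Called from the cache lifecycle once all components have started.
     void cacheStarted(String cacheName) { latchFor(cacheName).countDown(); }

     // Block the incoming command until the cache is started, not merely defined.
     void handle(String cacheName, Runnable command, long timeout, TimeUnit unit)
           throws InterruptedException {
        if (!latchFor(cacheName).await(timeout, unit))
           throw new IllegalStateException("Cache " + cacheName + " did not start in time");
        command.run();
     }
  }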

With the asymmetric cache support in place I think we shouldn't have
to wait at all, since we'll send the join request only after all the
components have been started. We could either apply the commands (if
we implement the non-blocking state transfer option) or reject them
and tell the originator to retry after the state transfer is done (if
we keep the blocking state transfer, since the other nodes shouldn't
be sending commands to us anyway).

 In this particular case, it's a clustered get command sent by a clustered 
 cache loader, arriving at the cache before it has started (and before the 
 cache loader has been created, hence the NPE).

 Another question: is there any reason why CacheLoader is not a named cache 
 component which can be initialised with a corresponding factory and into which 
 other components can be injected (i.e. marshaller, cache, etc.)?

 In this particular case, this would also resolve the issue because 
 ClusterCacheLoader.start() does nothing, so all the interceptor needs is a 
 proper instance of ClusterCacheLoader available. The factory makes these 
 available before injection.

 Thoughts?

 Cheers,

 p.s. Dan, I am aware of https://issues.jboss.org/browse/ISPN-1324, maybe 
 you're solving this indirectly with the work for that JIRA?

With asymmetric caches in place I don't think we'll need/want
ISPN-1324 any more.

Cheers
Dan


 --
 Galder Zamarreño
 Sr. Software Engineer
 Infinispan, JBoss Cache


 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Asymmetric caches and manual rehashing design

2011-09-28 Thread Dan Berindei
On Wed, Sep 28, 2011 at 12:59 PM, Bela Ban b...@redhat.com wrote:
 My 5 cents:

 Virtual Cache Views:
 


 - The REQUEST_VIEW_UPDATE is a good idea to manually trigger flushing of
 pending views. However, I'm not sure what the goal of
 DISABLE_VIEW_UPDATES is. Since I assume this is for the entire cluster,
 can 1 member disable view delivery for the entire cluster ? What
 scenario is this for ?


Indeed, one member could disable view delivery for the entire cluster.
The main scenario I had in mind for this was stopping the entire
cluster. A good rebalancing policy might handle such cases
automatically, so I'm not 100% sure we need this.

 - Are you clubbing (virtual) view updates and rebalancing together ? And
 if so (I should probably read on first...), can't you have view
 installations *without* rebalancing ?


I'm not sure what benefits you would get from joining the cache view
but not receiving state compared to not sending the join request at
all. Non-members are allowed to send/receive commands, so in the
future we could even have separate server nodes (that join the cache
view) and client nodes (joining the JGroups cluster to send
commands, but not the cache view and so not holding any data except
L1).

My idea was that the cache view was a representation of the caches
that are able to service requests, so it doesn't make sense to include
in the view caches that don't hold data.

As a practical note, if we add nodes to the consistent hash as they
come along, a key will have one set of owners if written to before the
join and another set of owners if written to after the join. How should
another node determine the current location of the key, especially
considering that there can be many view changes before rebalancing is
triggered?

 - Do we need the complex PREPARE_VIEW / ROLLBACK_VIEW / COMMIT_VIEW 2PC
 handling ? This adds a lot of complexity. Is it only used when we have a
 transactional cache ?


Nope, this doesn't have anything to do with transactional caches,
instead it is all about computing the owner that will push the key
during the rebalance operation.

In order to do it deterministically we need to have a common last
good consistent hash for the last rebalance that finished
successfully, and each node must determine if it should push a key or
not based on that last good CH.

A rebalance operation can also fail for various reasons (e.g. the
coordinator died). If that happens the new owners won't have all the
state, so they should not receive requests for the state that they
would have had in the pending CH.
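
In other words, something along these lines, where locate() only loosely
mirrors the real ConsistentHash interface and everything else is a
placeholder:

  import java.util.List;

  // Everything here is a placeholder; locate() only loosely mirrors the real CH.
  final class PushDecision {
     interface Address {}
     interface ConsistentHash { List<Address> locate(Object key, int numOwners); }

     // All nodes evaluate this against the same "last good" CH, so for every
     // key exactly one node (its primary owner in that CH) decides to push it.
     static boolean shouldPush(Object key, ConsistentHash lastGoodCH,
                               int numOwners, Address self) {
        return lastGoodCH.locate(key, numOwners).get(0).equals(self);
     }
  }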

 - State is to be transferred *within* this 2PC time frame. Hmm, again,
 this ties rebalancing and view installation together (see my argument
 above)...


If view installation wasn't tied to state transfer then we'd have to
keep the last rebalanced view somewhere else. We would hold the
last pending view (pending rebalance, that is) in the
CacheViewsManager and the last rebalanced view in another component,
and that component would have its own mechanism for synchronizing the
last rebalanced view among cache members. So I think the 2PC
approach in CacheViewsManager actually simplifies things.


 On 9/27/11 6:22 PM, Dan Berindei wrote:
 Following the discussions last week I've written up a wiki page
 describing the strategies for cache view management and state transfer
 that will enable asymmetric caches and manual rehashing:
 http://community.jboss.org/wiki/AsymmetricCachesAndManualRehashingDesign

 The state transfer part is not very detailed, as you'll see we want to
 go with a non-blocking approach but I'm not sure we can finish that
 for 5.1 so we're keeping a blocking fallback option.

 Your comments are welcome, either on the list or on the wiki page.


 --
 Bela Ban
 Lead JGroups (http://www.jgroups.org)
 JBoss / Red Hat
 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


Cheers
Dan
___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Asymmetric caches and manual rehashing design

2011-09-28 Thread Dan Berindei
On Sep 28, 2011 5:24 PM, Mircea Markus mircea.mar...@jboss.com wrote:

 Nice stuff.

 Some q:

 - When a node CM that has c1 and c2 caches running on it goes down, you'll
end up having  REQUEST_LEAVE(c1) and REQUEST_LEAVE(c2) at the same time. Is
this handled in serial or in parallel?


In parallel, although that shouldn't matter because the rebalance is
triggered asynchronously.

 - The view updates can be disabled dynamically (DISABLE_VIEW_UPDATES).
Shouldn't we also be able to re-enable them manually after that?

 "The view accept handler updates the pending view, acquires the view change
lock in exclusive mode, then releases it immediately"
 Shouldn't this be:
 - acquire exclusive lock
 - updates the pending view
 - release lock
 ?

Right, what I meant to say is that it doesn't hold the lock for the entire
duration of the rebalance task.


 On 27 Sep 2011, at 17:22, Dan Berindei wrote:

 Following the discussions last week I've written up a wiki page
 describing the strategies for cache view management and state transfer
 that will enable asymmetric caches and manual rehashing:
 http://community.jboss.org/wiki/AsymmetricCachesAndManualRehashingDesign

 The state transfer part is not very detailed, as you'll see we want to
 go with a non-blocking approach but I'm not sure we can finish that
 for 5.1 so we're keeping a blocking fallback option.

 Your comments are welcome, either on the list or on the wiki page.

 Dan
 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev



 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev
___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev

Re: [infinispan-dev] Asymmetric caches and manual rehashing design

2011-09-30 Thread Dan Berindei
On Thu, Sep 29, 2011 at 10:01 AM, Bela Ban b...@redhat.com wrote:


 On 9/28/11 1:48 PM, Dan Berindei wrote:
 On Wed, Sep 28, 2011 at 12:59 PM, Bela Banb...@redhat.com  wrote:
 My 5 cents:

 - Are you clubbing (virtual) view updates and rebalancing together ? And
 if so (I should probably read on first...), can't you have view
 installations *without* rebalancing ?


 I'm not sure what benefits you would get from joining the cache view
 but not receiving state compared to not sending the join request at
 all. Non-members are allowed to send/receive commands, so in the
 future we could even have separate server nodes (that join the cache
 view) and client nodes (joining the JGroups cluster to send
 commands, but not the cache view and so not holding any data except
 L1).


 I had the scenario in mind where you join 100 members and only *then* do
 a state transfer (rebalancing).



 My idea was that the cache view was a representation of the caches
 that are able to service requests, so it doesn't make sense to include
 in the view caches that don't hold data.


 OK. So with periodic rebalancing, you'd hold (virtual) views and state
 *until* the trigger fires, which then installs the new virtual views and
 rebalances the state ? In this case, tying view delivery and rebalancing
 together makes sense...


Yes, I'm installing the views only when the trigger fires, and
installing the views will also include transferring state.
So the 100 members might ask the coordinator to join the virtual view,
but they will only actually join (and receive state) when the new view
that includes them is installed.



 - Do we need the complex PREPARE_VIEW / ROLLBACK_VIEW / COMMIT_VIEW 2PC
 handling ? This adds a lot of complexity. Is it only used when we have a
 transactional cache ?


 Nope, this doesn't have anything to do with transactional caches,
 instead it is all about computing the owner that will push the key
 during the rebalance operation.

 In order to do it deterministically we need to have a common last
 good consistent hash for the last rebalance that finished
 successfully, and each node must determine if it should push a key or
 not based on that last good CH.


 OK. I just hope this makes sense for large clusters, as it is a 2PC,
 which doesn't scale to a larger number of nodes. I mean, we don't use
 FLUSH in large clusters for the same reason.
 Hmm, on the upside, you don't run this algorithm a lot though, so maybe
 running it only a few times amortizes the cost of it.


Yeah, it is kind of bad with large clusters, at least with the
blocking state transfer we have now. But in order to ensure we don't
lose data we need to serialize the rebalance operations, and I don't
think we can do that without 2PC.

If we switch to the non-blocking state transfer proposal it shouldn't
matter any more, the installation of the view will be delayed until we
get the consensus but transactions will still be able to proceed while
a view is in the prepare phase.


 With this algorithm, I assume you won't need the transitory view anymore
 (UnionConsistentHashFunction or whatever it was called), which includes
 both current and new owners of a key ?


We are going to need something similar: since we're waiting for the
coordinator to tell us to install a new CH, our CH may contain nodes
that are already stopped. So we will have to check the addresses
returned by the CH against the list of nodes still running.

But other than that we are just going to use the CH of the latest
committed view (with blocking state transfer) or the CH of the latest
pending view (with non-blocking state transfer). With blocking state
transfer we just know that the old owners will hold the key until the
new view is committed. The non-blocking state transfer proposal works
by letting the new owners fetch the key from the old owners, so there
is no need for the callers to fall back to the previous owners.

Cheers
Dan

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Why no use async listener executor?

2011-10-06 Thread Dan Berindei
Galder, I didn't know about @Listener(sync = false), but a separate
executor makes sense for state transfer because I didn't want
concurrent state transfers and I could set it up to automatically
cancel any state transfer task that's waiting in the queue.

For the StaleTransactionCleanup I suppose we could use an async
listener but it would definitely need a bigger thread pool because the
user might have async listeners as well.
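
For illustration, the async-listener pattern being suggested would look
roughly like this (the listener class and callback body are hypothetical; the
annotations are Infinispan's):

  import org.infinispan.notifications.Listener;
  import org.infinispan.notifications.cachemanagerlistener.annotation.ViewChanged;
  import org.infinispan.notifications.cachemanagerlistener.event.ViewChangedEvent;

  @Listener(sync = false) // dispatched on the async notification executor
  public class StaleTxCleanupListener {
     @ViewChanged
     public void onViewChange(ViewChangedEvent event) {
        // Runs off the JGroups thread, so it is allowed to block, e.g. while
        // breaking locks held by transactions of members that left the view.
     }
  }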


If only you'd sent this email one day earlier! I'd been
trying to reuse the async transport executor in my cache views
manager, but I didn't know it's also configured to use 1 thread max,
and I spent a good part of yesterday trying to understand what was going
on :)

I eventually moved on to create my own executor because I also wanted
to name my threads better and I couldn't find a simple way to include
the node name in the async transport executor's thread names, but I'm
pretty sure 1 is not a reasonable default for the async transport
either.
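
For reference, a thread factory along these lines would do it, assuming the
node name can be passed in at construction time (sketch only):

  import java.util.concurrent.ThreadFactory;
  import java.util.concurrent.atomic.AtomicInteger;

  // Sketch: bake the node name into the thread names of a dedicated executor.
  final class NamedThreadFactory implements ThreadFactory {
     private final String baseName;
     private final AtomicInteger counter = new AtomicInteger(1);

     NamedThreadFactory(String component, String nodeName) {
        this.baseName = component + "," + nodeName + "-";
     }

     public Thread newThread(Runnable r) {
        Thread t = new Thread(r, baseName + counter.getAndIncrement());
        t.setDaemon(true); // don't let maintenance threads keep the JVM alive
        return t;
     }
  }

  // e.g. Executors.newFixedThreadPool(2, new NamedThreadFactory("CacheViewInstaller", "NodeA"))
  // yields threads named "CacheViewInstaller,NodeA-1", "CacheViewInstaller,NodeA-2", ...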

Cheers
Dan


On Wed, Oct 5, 2011 at 6:53 PM, Galder Zamarreño gal...@redhat.com wrote:
 Hey guys,

 Re: https://issues.jboss.org/browse/ISPN-1396

 While looking into this, I've discovered that we have been creating executors 
 in cache level components, where we're calling submit from @Listener 
 implementations.

 For example, BaseStateTransferManagerImpl.ViewChangeListener submits a rehash 
 task and TransactionTable.StaleTransactionCleanup submits a tasks to break 
 locks.

 Did the people that wrote these listeners (Dan  Mircea respectively) 
 consider defining listeners to be called asynchronously? i.e. @Listener(sync 
 = false)

 Unless you have very strict requirements, the async listener executor should 
 just work, and would lead to reuse of executors, which results in lower 
 resource consumption.

 The same applies to the singleton store executor btw.

 Surely, if we were to expand to other use cases, the async notification 
 executor should have more than 1 max thread by default, otherwise we could 
 easily hold up tasks.

 Cheers,
 --
 Galder Zamarreño
 Sr. Software Engineer
 Infinispan, JBoss Cache


 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Infinispan 5.1 + AS 7.1

2011-10-14 Thread Dan Berindei
Hi Paul

Yeah, I have changed replicated mode's state transfer to transfer
state in parallel from all the existing nodes and I'm using a
consistent hash to determine which keys should be pushed by each node.
This one looks tricky, the transport reports 0 members but it should
always have at least the current node as a member.

I'll try to run with REPL_ASYNC to see if it changes anything, but I
don't think it should.

Cheers
Dan


On Fri, Oct 14, 2011 at 1:27 AM, Paul Ferraro paul.ferr...@redhat.com wrote:
 Hey all,

 I'm a bit stuck with the Infinispan 5.1 upgrade in AS 7.1.
 I've tried both with BETA1 and a SNAPSHOT build from today.

 When a deployment forces a cache to start, I see the following
 stacktrace:

 14:06:07,584 ERROR [org.jboss.msc.service.fail] (MSC service thread 1-8) 
 MSC1: Failed to start service jboss.infinispan.web.repl: 
 org.jboss.msc.service.StartException in service jboss.infinispan.web.repl: 
 Failed to start service
        at 
 org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1780)
  [jboss-msc-1.0.1.GA.jar:1.0.1.GA]
        at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
  [:1.6.0_23]
        at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
  [:1.6.0_23]
        at java.lang.Thread.run(Thread.java:679) [:1.6.0_23]
 Caused by: org.infinispan.CacheException: Unable to invoke method private 
 void org.infinispan.statetransfer.BaseStateTransferManagerImpl.start() throws 
 java.lang.Exception on object
        at 
 org.infinispan.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:205)
        at 
 org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:825)
        at 
 org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:624)
        at 
 org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:527)
        at 
 org.infinispan.factories.ComponentRegistry.start(ComponentRegistry.java:177)
        at org.infinispan.CacheImpl.start(CacheImpl.java:462)
        at 
 org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:574)
        at 
 org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:453)
        at 
 org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:467)
        at 
 org.jboss.as.clustering.infinispan.DefaultEmbeddedCacheManager.getCache(DefaultEmbeddedCacheManager.java:84)
        at 
 org.jboss.as.clustering.infinispan.DefaultEmbeddedCacheManager.getCache(DefaultEmbeddedCacheManager.java:73)
        at 
 org.jboss.as.clustering.infinispan.subsystem.CacheService.start(CacheService.java:73)
        at 
 org.jboss.msc.service.ServiceControllerImpl$StartTask.startService(ServiceControllerImpl.java:1824)
  [jboss-msc-1.0.1.GA.jar:1.0.1.GA]
        at 
 org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1759)
  [jboss-msc-1.0.1.GA.jar:1.0.1.GA]
        ... 3 more
 Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
 [:1.6.0_23]
        at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
 [:1.6.0_23]
        at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  [:1.6.0_23]
        at java.lang.reflect.Method.invoke(Method.java:616) [:1.6.0_23]
        at 
 org.infinispan.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:203)
        ... 16 more
 Caused by: java.lang.IllegalArgumentException: Invalid cache list for 
 consistent hash: []
        at 
 org.infinispan.distribution.ch.AbstractWheelConsistentHash.setCaches(AbstractWheelConsistentHash.java:96)
        at 
 org.infinispan.distribution.ch.ConsistentHashHelper.createConsistentHash(ConsistentHashHelper.java:122)
        at 
 org.infinispan.statetransfer.ReplicatedStateTransferManagerImpl.createConsistentHash(ReplicatedStateTransferManagerImpl.java:56)
        at 
 org.infinispan.statetransfer.BaseStateTransferManagerImpl.start(BaseStateTransferManagerImpl.java:143)

 What's particularly puzzling is that this is a REPL_ASYNC cache with
 state transfer enabled.  Why are we attempting to create a consistent
 hash here?  Dan, is this related to your work?

 Thanks in advance,

 Paul

 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] getting ISPN config as XML via JMX

2011-10-17 Thread Dan Berindei
On Fri, Oct 14, 2011 at 7:40 PM, Pete Muir pm...@redhat.com wrote:
 What is the use case for that method? I've never known anyone actually want to 
 export a running config as XML. So I was planning to lose it.


It may be useful to take the configuration out of a cache configured
in AS' ISPN subsystem and run it in standalone mode to isolate bugs...


Cheers
Dan
___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Infinispan 5.1 + AS 7.1

2011-10-17 Thread Dan Berindei
I'm afraid I can't reproduce it here and I don't know of any situation
where the transport could return 0 members.

Could I get your code and debug it on my machine?

Cheers
Dan


On Mon, Oct 17, 2011 at 5:25 PM, Paul Ferraro paul.ferr...@redhat.com wrote:
 Hey Dan,

 Any progress on this?  I'd like to push the Infinispan/JGroups upgrade
 this week if possible, but this is currently blocking my progress.
 Thanks,

 Paul

 On Fri, 2011-10-14 at 15:36 +0300, Dan Berindei wrote:
 Hi Paul

 Yeah, I have changed replicated mode's state transfer to transfer
 state in parallel from all the existing nodes and I'm using a
 consistent hash to determine which keys should be pushed by each node.
 This one looks tricky, the transport reports 0 members but it should
 always have at least the current node as a member.

 I'll try to run with REPL_ASYNC to see if it changes anything, but I
 don't think it should.

 Cheers
 Dan


 On Fri, Oct 14, 2011 at 1:27 AM, Paul Ferraro paul.ferr...@redhat.com 
 wrote:
  Hey all,
 
  I'm a bit stuck with the Infinispan 5.1 upgrade in AS 7.1.
  I've tried both with BETA1 and a SNAPSHOT build from today.
 
  When a deployment forces a cache to start, I see the following
  stacktrace:
 
   [same stack trace as in the original message above; snipped]
 
  What's particularly puzzling is that this is a REPL_ASYNC cache with
  state transfer enabled.  Why are we attempting to create a consistent
  hash here?  Dan, is this related to your work?
 
  Thanks

Re: [infinispan-dev] strictly not returning expired values

2011-10-17 Thread Dan Berindei
On Mon, Oct 17, 2011 at 7:51 PM, Sanne Grinovero sa...@infinispan.org wrote:
 I've noticed that every time we perform a get() operation (or any
 other retrieval) the value is NOT returned to the client if it's
 expired, which is going to receive a null instead.

 It looks like that these checks are in place to prevent
 a) a race condition with the eviction process
 b) to not return very old values from a very large wakeUpInterval
 between eviction activity
 c) to load from cacheLoaders

 Now for the first case, as a user of a caching library I would prefer
 to get the value instead. Expiry is in place only to prevent me from
 wasting memory, but this entry wasn't collected yet so why is
 Infinispan denying me access to the value I want to retrieve? We're
 introducing unneeded cache misses.

 Since even with the current code we can not provide guarantees that we
 won't ever return an expired entry (between  the check and the actual
 return to user code we might be briefly suspended), I don't see why we
 should not make it clear that the expiry timeout is a best-effort time
 and relax our code checks a bit; there are many reasons to do so:

 1) avoid fictitious cache misses.

 2) Avoid the need to invoke a System.currentTimeMillis() for each
 retrieval operation - that's expensive!

 3) Some operations like size() or values() now explicitly iterate the
 whole result set twice to make sure all expired entries are removed.
 This is unnecessary following the reasoning above.

 Regarding the other cases:
 b) We don't need to care imo. If people need more precision they
 should use a lower wakeUpInterval.

I don't think this is an option, as a 5-second wakeUpInterval was
deemed too short not long ago:
https://source.jboss.org/viewrep/Infinispan/core/src/main/java/org/infinispan/config/Configuration.java?r1=9fe9a8cb6e49308b9e45f09b2f5324a7daefdee3r2=9cda361dc6df7bcecde95abc31fcbab11f734360

 c) We should keep this logic in CacheLoader, just because having an
 aggressive wakeUpInterval might be very expensive in this case.

 I'm asking as I've built a quick POC for ISPN-1459, but since the
 current time now needs to be passed to the entry, I end up needing to
 invoke the wall clock even for immortal cache entries, making many
 other use cases slower: not happy with that, I'd rather speed up both
 use cases.


I guess you could pass in a Clock that lazily calls
System.currentTimeMillis() and saves it for subsequent calls, but that
will complicate things again.
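
Something like this sketch, with one instance per invocation context
(illustrative only, and deliberately not thread-safe since a context is
single-threaded):

  // One instance per invocation context; deliberately not thread-safe.
  final class LazyClock {
     private long timeMillis; // 0 = wall clock not read yet

     long currentTimeMillis() {
        if (timeMillis == 0) {
           timeMillis = System.currentTimeMillis(); // at most one real clock read
        }
        return timeMillis;
     }
  }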

Dan


 Sanne
 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Time measurement and expiry

2011-10-18 Thread Dan Berindei
On Tue, Oct 18, 2011 at 1:32 AM, Mircea Markus mircea.mar...@jboss.com wrote:

 On 17 Oct 2011, at 14:13, Sanne Grinovero wrote:

 Very interesting, I knew that in Windows currentTimeMillis() basically
 just reads a volatile because I got bit by the 15 millisecond accuracy
 issue before, so I thought it would always be very fast. I had no idea
 on Linux it would have the same performance as nanoTime().
 Indeed very nice!
 I miss the part where the article says nanoTime has the same performance on 
 modern Linux as currentTimeMillis, are you sure?

Yeah, the article didn't talk about Linux but I found this article:
http://blogs.oracle.com/ksrini/entry/we_take_java_performance_very
There's also this JDK bug complaining about both being slow:
http://bugs.sun.com/view_bug.do?bug_id=6876279  (the test output is in
a weird format, ignore the 1st number, the 2nd one is nanos/call).
Except that when I actually ran the code in the bug report, my timings
were much better than those reported in the bug description:

java version 1.6.0_22
OpenJDK Runtime Environment (IcedTea6 1.10.3) (fedora-59.1.10.3.fc15-x86_64)
OpenJDK 64-Bit Server VM (build 20.0-b11, mixed mode)
Intel(R) Core(TM) i5 CPU   M 540  @ 2.53GHz

currentTimeMillis: 36ns
nanoTime: 28ns

-server or -XX:AggressiveOpts don't seem to make a difference.


I also ran the test on cluster10 and the results are slightly worse,
but not significantly:

java version 1.6.0_17
OpenJDK Runtime Environment (IcedTea6 1.7.10) (rhel-1.39.b17.el6_0-x86_64)
OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode)
Intel(R) Xeon(R) CPU   E5640  @ 2.67GHz

currentTimeMillis: 40ns
nanoTime: 35ns


It would be interesting if we could run the test on all our machines
and see how the timings vary by machine and OS.
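
For anyone who wants to try, a crude self-contained version of the test looks
like this (not the exact code from the bug report):

  public class ClockBench {
     public static void main(String[] args) {
        final int iterations = 10000000;
        long sink = 0;

        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) sink += System.currentTimeMillis();
        System.out.println("currentTimeMillis: "
              + (System.nanoTime() - start) / iterations + "ns/call");

        start = System.nanoTime();
        for (int i = 0; i < iterations; i++) sink += System.nanoTime();
        System.out.println("nanoTime: "
              + (System.nanoTime() - start) / iterations + "ns/call");

        if (sink == 42) System.out.println(); // keep the JIT from dropping the loops
     }
  }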


It seems we're not the only ones with this problem either, Oracle (the
database) apparently calls gettimeofday() a lot so RHEL has some
optimizations to remove the system call overhead and make it even
faster (more like Windows I presume, but I don't have a Windows
machine on hand to confirm):
http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_MRG/1.3/html/Realtime_Tuning_Guide/sect-Realtime_Tuning_Guide-General_System_Tuning-gettimeofday_speedup.html
http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_MRG/1.3/html/Realtime_Tuning_Guide/sect-Realtime_Tuning_Guide-Realtime_Specific_Tuning-RT_Specific_gettimeofday_speedup.html
___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Reminders about GIT

2011-10-18 Thread Dan Berindei
Nice!

On Tue, Oct 18, 2011 at 6:16 PM, Sanne Grinovero sa...@infinispan.org wrote:
 Almost forgot the main tip; use:

 git config branch.master.mergeoptions --ff-only

 It will prevent creating a non-linear merge, failing with an error such as:

 fatal: Not possible to fast-forward, aborting.

 Cheers,
 Sanne

 On 18 October 2011 16:09, Sanne Grinovero sa...@infinispan.org wrote:
 1) If you have warnings about "Merge made by recursive" you have to
 fix it by rebasing.

 2) If you have warnings about non-fast-forward you have to rebase.

 3) If you see non-fast-forward updates were rejected you shall never
 use force on upstream! It means that another patch was merged before
 you and you have to update your master again, and rebase again.

 4) force is allowed only in special maintenance circumstances. If
 you find you're needing it to handle a pull request, then you're doing
 it wrong, and the mistake might be a dangerous one. It's like the good
 rule of never commit when you're drunk (coding is allowed).

 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Infinispan 5.1.0.BETA2 is out with asymmetric cluster support

2011-10-19 Thread Dan Berindei
Nice writeup Galder!


On Wed, Oct 19, 2011 at 10:54 AM, Manik Surtani ma...@jboss.org wrote:
 Nice one!

 On 19 Oct 2011, at 08:51, Galder Zamarreño wrote:

 Read all about it in http://goo.gl/C6BtO

 Cheers,
 --
 Galder Zamarreño
 Sr. Software Engineer
 Infinispan, JBoss Cache


 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

 --
 Manik Surtani
 ma...@jboss.org
 twitter.com/maniksurtani

 Lead, Infinispan
 http://www.infinispan.org




 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Time measurement and expiry

2011-10-19 Thread Dan Berindei
On Tue, Oct 18, 2011 at 10:57 AM, Sanne Grinovero sa...@infinispan.org wrote:
 The replies so far are very interesting, but at this stage I'd be more
 interested in discussing the fundamental question:

 Why are we preferring to provide more cache misses  slower
 performance, to fake a slightly better precision in eviction?


I guess this is the main question, how much worse would the eviction
precision be if we only relied on the periodic eviction thread?

I would agree with you if we could guarantee a constant precision in
eviction, but with the current algorithm the eviction interval has to
grow with the cache size so we can't offer any guarantees. And that's
even without considering CacheLoaders.

Elias' suggestion of using a heap for eviction is interesting, but
adding entries to the heap will slow puts. We could try a combined
approach, every minute scan the data container and add the entries
that will expire in the next minute to a heap. Only puts that expire
before the deadline will add the entry to the eviction heap. Then
another eviction thread can evict entries from the heap at a higher
frequency.
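
A quick sketch of that combined approach, with all names invented for
illustration:

  import java.util.concurrent.PriorityBlockingQueue;

  // All names invented for illustration.
  final class ExpirationHeap {
     static final class Expiration implements Comparable<Expiration> {
        final Object key;
        final long deadline;
        Expiration(Object key, long deadline) { this.key = key; this.deadline = deadline; }
        public int compareTo(Expiration o) {
           return deadline < o.deadline ? -1 : (deadline == o.deadline ? 0 : 1);
        }
     }

     private final PriorityBlockingQueue<Expiration> heap =
           new PriorityBlockingQueue<Expiration>();
     private volatile long horizon; // refreshed by the once-a-minute container scan

     // Called on put(): only entries expiring before the horizon pay the heap cost.
     void onPut(Object key, long deadlineMillis) {
        if (deadlineMillis < horizon) heap.offer(new Expiration(key, deadlineMillis));
     }

     // Called by the high-frequency eviction thread.
     Object pollExpired(long nowMillis) {
        Expiration head = heap.peek();
        if (head == null || head.deadline > nowMillis) return null;
        return heap.poll().key; // racy peek-then-poll is fine for a sketch
     }
  }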

I'm sure there are better ideas, but we'd have to prototype them and
show that we can keep the staleness under an upper bound (say 1
second) for reasonably large caches before changing the algorithm.

 Since we can't guarantee precision, I'd want to actually remove these
 checks altogether: is there any use case requiring high precision in
 the eviction process?
 I hope not, I wouldn't suggest to rely on Infinispan as a reliable clock.


Relying on (relatively) precise expiration is the kind of implicit
assumption that people make and don't even realize they depend on it
until an upgrade breaks their application.

I'm curious if other cache impls offer any guarantees in this area.

 In the couple of cases we should really keep this logic - likely the
 CacheLoader - we can discuss how to optimize this. For example I like
 Dan's idea of introducing a Clock component; we could explore such
 solutions for the remaining bits which will still need a time source
 but I'd first want to remove the main bottleneck.

 --Sanne


 On 18 October 2011 07:57, Dan Berindei dan.berin...@gmail.com wrote:
 [earlier "Time measurement and expiry" message quoted in full; snipped]
 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman

Re: [infinispan-dev] JBoss Logging upgrade breaking build

2011-10-24 Thread Dan Berindei
David, I've upgraded to 3.1.0.Beta3 and I'm getting the same errors in
the query module:

[INFO] diagnostic error: All message bundles and message logger
messageMethods must have or inherit a message.
...
[ERROR] error on execute: error during compilation

Dan


On Mon, Oct 24, 2011 at 2:14 AM, David M. Lloyd david.ll...@redhat.com wrote:
 Okay, I've gotten to the root of the build issue.  If you update your
 jboss-logging to 3.1.0.Beta3, and do a clean rebuild, the problem should
 go away.

 The issue was caused by me changing the retention on the annotations -
 cross-artifact interface hierarchies would stop working in this case.

 On 10/23/2011 01:59 PM, David M. Lloyd wrote:
 On 10/23/2011 03:20 AM, Galder Zamarreño wrote:
 Hi,

 This JBoss Logging upgrade is rather annoying for a couple of
 reasons:

 1. IDE integration is broken: Before we had a JBoss Logging Processor
 as some dependency that allowed the logging processor to be found in
 the classpath. Now this is no longer in the classpath, so IntelliJ
 forces you to point to a jar. However, the new processing jar is
 split between two jars:
 ./jboss-logging-processor/1.0.0.CR3/jboss-logging-processor-1.0.0.CR3.jar
 and
 ./jboss-logging-generator/1.0.0.CR3/jboss-logging-generator-1.0.0.CR3.jar
 - And the generation won't work without both jars being pointed to by
 the annotation processor. David, why did you split these jars?

 We split the JARs at the request of the Seam Solder project who wanted
 to reuse the generator.  It should however still work just fine if you
 add both JARs as provided dependencies; I haven't had a problem here,
 even with IDEA.

 James and I have been discussing the ultimate wisdom of this strategy
 though.  We might consider re-merging the JARs if it proves to cause
 insurmountable issues (they aren't getting *that* much value from our
 code, really).  But I'm really surprised this is turning out to be a
 serious issue.

 2. It breaks our build, see
 https://infinispan.ci.cloudbees.com/job/Infinispan-master-JDK6-tcp/268/org.infinispan$infinispan-query/console
 - What are these errors about? And why is it only an [INFO] when it breaks the
 compilation? :) [INFO] diagnostic error: All message bundles and
 message logger messageMethods must have or inherit a message.

 I've never seen this before.  Maybe this error is a result of extending
 the other message bundle interface, that isn't in the same artifact?
 We'll see about putting in some more tests for this case.
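
 For the record, the failing shape seems to be roughly this (the method and
 id below are made up; only the annotations and the cross-artifact extends
 are the point):

   import org.jboss.logging.LogMessage;
   import org.jboss.logging.Message;
   import org.jboss.logging.MessageLogger;

   // A logger interface in the query artifact extending one from core. If the
   // parent's annotations lose their retention, the processor can't see the
   // inherited @Message values any more and flags every inherited method.
   @MessageLogger(projectCode = "ISPN")
   public interface QueryLogSketch extends org.infinispan.util.logging.Log {
      @LogMessage
      @Message(value = "Reindexing cache %s", id = 99001) // made-up message/id
      void reindexingStarted(String cacheName);
   }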

 What is wrong with
 https://github.com/infinispan/infinispan/blob/master/query/src/main/java/org/infinispan/query/logging/Log.java
 ?

 Cheers,

 p.s. Can we please test these upgrades thoroughly before
 committing them? Thx :)

 Of course we test, but of course things can be missed.  In particular
 I've integrated the new version with several projects without any issues.



 --
 - DML
 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Preloading from disk versus state transfer Re: ISPN-1384 - InboundInvocationHandlerImpl should wait for cache to be started? (not just defined)

2011-10-24 Thread Dan Berindei
ISPN-1470 (https://issues.jboss.org/browse/ISPN-1470) raises an
interesting question: if the preloading happens before joining, the
preloading code won't know anything about the consistent hash. It will
load everything from the cache store, including the keys that are
owned by other nodes.

I think there is a check in place already so that the joiner won't
push stale data from its cache store to the other nodes, but we should
also discard the keys that don't map locally or we'll have stale data
(since we don't have a way to check if those keys are stale and
register to receive invalidations for those keys).
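
The discarding itself would be cheap, something like this sketch where
locate() only loosely mirrors ConsistentHash and the rest is placeholder:

  import java.util.List;
  import java.util.Map;

  // Placeholder types; locate() only loosely mirrors the real ConsistentHash.
  final class ChAwarePreload {
     interface Address {}
     interface ConsistentHash { List<Address> locate(Object key, int numOwners); }

     static void preload(Map<Object, Object> loadedFromStore,
                         Map<Object, Object> dataContainer,
                         ConsistentHash ch, int numOwners, Address self) {
        for (Map.Entry<Object, Object> e : loadedFromStore.entrySet()) {
           // Keep only keys this node owns: a non-local key may already be
           // stale, and we'd never receive an invalidation for it.
           if (ch.locate(e.getKey(), numOwners).contains(self)) {
              dataContainer.put(e.getKey(), e.getValue());
           }
        }
     }
  }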

What do you think, should I discard the non-local keys with the fix
for ISPN-1470 or should I let them be and warn the user about
potentially stale data?

Cheers
Dan


On Mon, Oct 3, 2011 at 3:09 AM, Manik Surtani ma...@jboss.org wrote:

 On 28 Sep 2011, at 10:56, Dan Berindei wrote:

 I'm not sure if the comment is valid though, since the old
 StateTransferManager had priority 55 and it also cleared the data
 container before applying the state from the coordinator. I'm not sure
 how preloading and state transfer are supposed to interact, maybe
 Manik can help clear this up?

 Hmm - this is interesting.  I think preloading should happen first, since
 the cache store may contain old data.
 --
 Manik Surtani
 ma...@jboss.org
 twitter.com/maniksurtani
 Lead, Infinispan
 http://www.infinispan.org




___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Preloading from disk versus state transfer Re: ISPN-1384 - InboundInvocationHandlerImpl should wait for cache to be started? (not just defined)

2011-10-26 Thread Dan Berindei
On Mon, Oct 24, 2011 at 4:42 PM, Sanne Grinovero sa...@infinispan.org wrote:
 On 24 October 2011 12:58, Dan Berindei dan.berin...@gmail.com wrote:
 Hi Galder

 On Mon, Oct 24, 2011 at 1:46 PM, Galder Zamarreño gal...@redhat.com wrote:

 On Oct 24, 2011, at 12:04 PM, Dan Berindei wrote:

 ISPN-1470 (https://issues.jboss.org/browse/ISPN-1470) raises an
 interesting question: if the preloading happens before joining, the
 preloading code won't know anything about the consistent hash. It will
 load everything from the cache store, including the keys that are
 owned by other nodes.

 It's been defined to work that way:
 https://docs.jboss.org/author/display/ISPN/CacheLoaders

 Tbh, that will only happen in shared cache stores. In non-shared ones, 
 you'll only have data that belongs to that node.


 Not really... in distributed mode, every time the cache starts it will
 have another position on the hash wheel.
 That means even with a non-shared cache store, it's likely most of the
 stored keys will no longer be local.

 Actually I just noticed that you've fixed ISPN-1404, which looks like
 it would solves my problem when the cache is created by a HotRod
 server. I would like to extend it to work like this by default, e.g.
 by using the transport's nodeName as the seed.

 I think there is a check in place already so that the joiner won't
 push stale data from its cache store to the other nodes, but we should
 also discard the keys that don't map locally or we'll have stale data
 (since we don't have a way to check if those keys are stale and
 register to receive invalidations for those keys).

 +1, only for shared cache stores.


 What do you think, should I discard the non-local keys with the fix
 for ISPN-1470 or should I let them be and warn the user about
 potentially stale data?

 Discard only for shared cache stores.

 Cache configurations should be symmetrical, so if other nodes preload, 
 they'll preload only data local to them with your change.


 Discarding works fine from the correctness POV, but for performance
 it's not that great: we may do a lot of work to preload keys and have
 nothing to show for it at the end.

 Can't you just skip loading state and be happy with the state you
 receive from peers? More data will be lazily loaded.
 Applying of course only when you're not the only/first node in the
 grid, in which case you have to load.


Right, we could preload only on the first node. With a shared cache
store this should work great, we just have to start preloading after
we connect to the cluster and before we send the join request.

But I have trouble visualizing how a persistent (purgeOnStartup =
false) non-shared cache store should work until we have some
validation mechanism like in
https://issues.jboss.org/browse/ISPN-1195. Should we even allow this
kind of setup?


 The only alternative I see is to be able to find the boundaries of
 keys you own, and change the CacheLoader API to load keys by the
 identified range - should work with multiple boundaries too for
 virtual nodes, but this is something that not all CacheLoaders will be
 able to implement, so it should be an optional API; for now I'd stick
 with the first option above as I don't see how we can be more
 efficient in loading the state from CacheLoaders than via JGroups.

 Sanne

 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Cache ops after view changes result in RehashInProgressException very easily

2011-10-26 Thread Dan Berindei
Hi Galder, sorry it took so long to reply.

On Mon, Oct 24, 2011 at 4:16 PM, Galder Zamarreño gal...@redhat.com wrote:
 Btw, forgot to attach the log:




 On Oct 24, 2011, at 3:13 PM, Galder Zamarreño wrote:

 Hi Dan,

 Re: http://goo.gl/TGwrP

 There's a few of this in the Hot Rod server+client testsuites. It's easy to 
 replicate it locally. Seems like cache operations right after a cache has 
 started are rather problematic.

 In local execution of HotRodReplicationTest, I was able to replicate the 
 issue when trying to test topology changes. Please find attached the log 
 file, but here're the interesting bits:

 1. A new view installation is being prepared with NodeA and NodeB:
 2011-10-24 14:36:09,046 4221  TRACE 
 [org.infinispan.cacheviews.CacheViewsManagerImpl] 
 (OOB-1,Infinispan-Cluster,NodeB-15806:___hotRodTopologyCache) 
 ___hotRodTopologyCache: Preparing cache view CacheView{viewId=4, 
 members=[NodeA-63227, NodeB-15806]}, committed view is CacheView{viewId=3, 
 members=[NodeA-63227, NodeB-15806, NodeC-17654]}
 …
 2011-10-24 14:36:09,047 4222  DEBUG 
 [org.infinispan.statetransfer.StateTransferLockImpl] 
 (OOB-1,Infinispan-Cluster,NodeB-15806:___hotRodTopologyCache) Blocking new 
 transactions
 2011-10-24 14:36:09,047 4222  TRACE 
 [org.infinispan.statetransfer.StateTransferLockImpl] 
 (OOB-1,Infinispan-Cluster,NodeB-15806:___hotRodTopologyCache) Acquiring 
 exclusive state transfer shared lock, shared holders: 0
 2011-10-24 14:36:09,047 4222  TRACE 
 [org.infinispan.statetransfer.StateTransferLockImpl] 
 (OOB-1,Infinispan-Cluster,NodeB-15806:___hotRodTopologyCache) Acquired state 
 transfer lock in exclusive mode

 2. The cluster coordinator discovers a view change and requests NodeA and 
 NodeB to remove NodeC from the topology view:
 2011-10-24 14:36:09,048 4223  TRACE 
 [org.infinispan.interceptors.InvocationContextInterceptor] 
 (OOB-3,Infinispan-Cluster,NodeB-15806:___hotRodTopologyCache) Invoked with 
 command RemoveCommand{key=NodeC-17654, value=null, flags=null} and 
 InvocationContext [NonTxInvocationContext{flags=null}]

 3. NodeB has not yet finished installing the cache view, so that remove 
 times out:
 2011-10-24 14:36:09,049 4224  ERROR 
 [org.infinispan.interceptors.InvocationContextInterceptor] 
 (OOB-3,Infinispan-Cluster,NodeB-15806:___hotRodTopologyCache) ISPN000136: 
 Execution error
 org.infinispan.distribution.RehashInProgressException: Timed out waiting for 
 the transaction lock

 A way to solve this is to avoid relying on cluster view changes, but instead 
 wait for the cache view to be installed, and only then do the operations. Is 
 there any way to wait till then?

 One way would be to have some CacheView installed callbacks or similar. This 
 could be a good option cos I could have a CacheView listener for the hot rod 
 topology cache whose callbacks I can check for isPre=false and then do the 
 cache ops safely.


Initially I was thinking of allowing multiple cache view listeners for
each cache and making StateTransferManager one of them but I decided
against it because I realized it needs a different interface than our
regular listeners. I know that it was only a matter of time until
someone needed it...

An alternative solution would be to retry all operations, like we do
with commits now, when we receive a RehashInProgressException
exception from the remote node. That's what I was planning to do first
as it helps in other use cases as well.
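
The retry would be a simple loop, roughly like this (sketch only; the real
version would plug into the RPC dispatcher rather than take a Callable):

  import java.util.concurrent.Callable;
  import org.infinispan.distribution.RehashInProgressException;

  final class RehashRetry {
     static <T> T invokeWithRetry(Callable<T> remoteOperation,
                                  int maxRetries, long backoffMillis) throws Exception {
        for (int attempt = 0; ; attempt++) {
           try {
              return remoteOperation.call();
           } catch (RehashInProgressException e) {
              if (attempt >= maxRetries) throw e;
              Thread.sleep(backoffMillis); // let the remote node finish installing the view
           }
        }
     }
  }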

 Otherwise, code like the one I used for keeping the Hot Rod topology is 
 gonna be racing against your cache view installation code.

 You seem to have some pieces in place for this, i.e. CacheViewListener, but 
 it seems only designed for internal core/ work.

 Any other suggestions?

 Cheers,
 --
 Galder Zamarreño
 Sr. Software Engineer
 Infinispan, JBoss Cache


 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

 --
 Galder Zamarreño
 Sr. Software Engineer
 Infinispan, JBoss Cache


 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Cache ops after view changes result in RehashInProgressException very easily

2011-10-26 Thread Dan Berindei
I was gunning for beta3 but I don't think I'm going to make it.

On Wed, Oct 26, 2011 at 12:38 PM, Galder Zamarreño gal...@redhat.com wrote:

 On Oct 26, 2011, at 9:46 AM, Dan Berindei wrote:

 Hi Galder, sorry it took so long to reply.

 On Mon, Oct 24, 2011 at 4:16 PM, Galder Zamarreño gal...@redhat.com wrote:
 Btw, forgot to attach the log:




 On Oct 24, 2011, at 3:13 PM, Galder Zamarreño wrote:

 Hi Dan,

 Re: http://goo.gl/TGwrP

 There's a few of this in the Hot Rod server+client testsuites. It's easy 
 to replicate it locally. Seems like cache operations right after a cache 
 has started are rather problematic.

 In local execution of HotRodReplicationTest, I was able to replicate the 
 issue when trying to test topology changes. Please find attached the log 
 file, but here're the interesting bits:

 1. A new view installation is being prepared with NodeA and NodeB:
 2011-10-24 14:36:09,046 4221  TRACE 
 [org.infinispan.cacheviews.CacheViewsManagerImpl] 
 (OOB-1,Infinispan-Cluster,NodeB-15806:___hotRodTopologyCache) 
 ___hotRodTopologyCache: Preparing cache view CacheView{viewId=4, 
 members=[NodeA-63227, NodeB-15806]}, committed view is CacheView{viewId=3, 
 members=[NodeA-63227, NodeB-15806, NodeC-17654]}
 …
 2011-10-24 14:36:09,047 4222  DEBUG 
 [org.infinispan.statetransfer.StateTransferLockImpl] 
 (OOB-1,Infinispan-Cluster,NodeB-15806:___hotRodTopologyCache) Blocking new 
 transactions
 2011-10-24 14:36:09,047 4222  TRACE 
 [org.infinispan.statetransfer.StateTransferLockImpl] 
 (OOB-1,Infinispan-Cluster,NodeB-15806:___hotRodTopologyCache) Acquiring 
 exclusive state transfer shared lock, shared holders: 0
 2011-10-24 14:36:09,047 4222  TRACE 
 [org.infinispan.statetransfer.StateTransferLockImpl] 
 (OOB-1,Infinispan-Cluster,NodeB-15806:___hotRodTopologyCache) Acquired 
 state transfer lock in exclusive mode

 2. The cluster coordinator discovers a view change and requests NodeA and 
 NodeB to remove NodeC from the topology view:
 2011-10-24 14:36:09,048 4223  TRACE 
 [org.infinispan.interceptors.InvocationContextInterceptor] 
 (OOB-3,Infinispan-Cluster,NodeB-15806:___hotRodTopologyCache) Invoked with 
 command RemoveCommand{key=NodeC-17654, value=null, flags=null} and 
 InvocationContext [NonTxInvocationContext{flags=null}]

 3. NodeB has not yet finished installing the cache view, so that remove 
 times out:
 2011-10-24 14:36:09,049 4224  ERROR 
 [org.infinispan.interceptors.InvocationContextInterceptor] 
 (OOB-3,Infinispan-Cluster,NodeB-15806:___hotRodTopologyCache) ISPN000136: 
 Execution error
 org.infinispan.distribution.RehashInProgressException: Timed out waiting 
 for the transaction lock

 A way to solve this is to avoid relying on cluster view changes, but 
 instead wait for the cache view to be installed, and only then do the 
 operations. Is there any way to wait till then?

 One way would be to have some CacheView installed callbacks or similar. 
 This could be a good option cos I could have a CacheView listener for the 
 hot rod topology cache, in whose callbacks I can check for isPre=false and 
 then do the cache ops safely.


 Initially I was thinking of allowing multiple cache view listeners for
 each cache and making StateTransferManager one of them but I decided
 against it because I realized it needs a different interface than our
 regular listeners. I know that it was only a matter of time until
 someone needed it...

 An alternative solution would be to retry all operations, like we do
 with commits now, when we receive a RehashInProgressException
 from the remote node. That's what I was planning to do first,
 as it helps in other use cases as well.

 Ok, do you have time to include this today ahead of the BETA3 release?

 I think this is a very important fix cos as you can see in the testsuite, 
 it's very easy to get this error with Hot Rod servers.


 Otherwise, code like the one I used for keeping the Hot Rod topology 
 is gonna be racing against your cache view installation code.

 You seem to have some pieces in place for this, e.g. CacheViewListener, 
 but it seems designed only for internal core/ work.

 Any other suggestions?

 Cheers,
 --
 Galder Zamarreño
 Sr. Software Engineer
 Infinispan, JBoss Cache



Re: [infinispan-dev] Time measurement and expiry

2011-10-30 Thread Dan Berindei
Cool link Elias, I read the paper linked in the javadoc and it's very
interesting.

Netty seems to use scheme 6 (a single array of lists with hashing and
without any overflow). I believe in general our timeouts will be much
longer so I suspect scheme 5 or scheme 7 should perform better for us.

Scheme 5 should be much better than 6 if most of the timeouts use the
same value, and I believe we fit right in that use case.
On the other hand I think we could save some memory if we implement
scheme 7 by not actually storing the higher level wheels and building
them on the fly by iterating the data container. The downside is that
we'll need periodic sweeps of the data container, but those can be on
a separate thread.

One scenario that we should keep in mind is the L1 cache, whose
entries I would guess are invalidated much more often than they expire
after their 1 hour lifetime. If the user stores only immortal entries
+ L1 entries in the cache we might end up with a ticker task that
rarely does anything useful, so we should minimize its impact.

It might also be worth it to implement our own list type to make
adding and removing timeouts faster. I see Netty uses
IdentityHashMap-backed sets for the bucket lists, but as Sanne found
out they aren't the fastest thing around. Using a doubly-linked list
and keeping a reference to the Node in the cache entry should make it
much faster.
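
For example, an intrusive doubly-linked bucket could look roughly like this
(a sketch, all names hypothetical); the cache entry remembers the Node
returned by add(), so removal needs no lookup at all:

    // Sketch: each wheel bucket is an intrusive doubly-linked list with a
    // sentinel head; the cache entry keeps the Node returned by add(), so
    // cancelling a timeout is a constant-time unlink.
    final class TimeoutBucket {
       static final class Node {
          final Object entry;   // the cache entry this timeout belongs to
          Node prev, next;
          Node(Object entry) { this.entry = entry; }
       }

       private final Node head = new Node(null);

       TimeoutBucket() { head.prev = head.next = head; }

       // Appends the entry and returns the Node the cache entry should remember.
       Node add(Object entry) {
          Node n = new Node(entry);
          n.prev = head.prev;
          n.next = head;
          head.prev.next = n;
          head.prev = n;
          return n;
       }

       // O(1) unlink, given the Node stored in the cache entry.
       void remove(Node n) {
          n.prev.next = n.next;
          n.next.prev = n.prev;
          n.prev = n.next = null;
       }
    }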

Cheers
Dan


On Tue, Oct 25, 2011 at 8:02 PM, Elias Ross gen...@noderunner.net wrote:
 Rather than a heap, I recall Netty had a hash wheel timer [1], which should
 perform better than a heap. You could get clever by having multiple wheels,
 segmented by orders of magnitude. For example wheel 1 could hold timeouts
 under 1 minute, wheel 2 under 1 hour, etc. Netty uses a lot of short-lived
 timers and performs well.

 [1] 
 http://docs.jboss.org/netty/3.1/api/org/jboss/netty/util/HashedWheelTimer.html

 Obviously this uses some memory, but you could probably optimize it
 somewhat to only work with Node instances.
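
(A minimal usage sketch of the timer from [1], against the Netty 3.x API;
the println stands in for whatever expiry action we'd plug in.)

    import java.util.concurrent.TimeUnit;
    import org.jboss.netty.util.HashedWheelTimer;
    import org.jboss.netty.util.Timeout;
    import org.jboss.netty.util.TimerTask;

    public class WheelTimerExample {
       public static void main(String[] args) throws InterruptedException {
          HashedWheelTimer timer = new HashedWheelTimer();
          // schedule a one-shot timeout 3 seconds from now
          timer.newTimeout(new TimerTask() {
             public void run(Timeout timeout) {
                System.out.println("entry expired");
             }
          }, 3, TimeUnit.SECONDS);
          Thread.sleep(4000);  // let the timeout fire
          timer.stop();        // release the worker thread
       }
    }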
 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] ISPN-1543 Split CacheImpl into TransactionalCache and NonTransactionalCache

2011-11-22 Thread Dan Berindei
Any pros?

On Tue, Nov 22, 2011 at 2:11 AM, Mircea Markus mircea.mar...@jboss.com wrote:
 This is simple to implement and I don't see how it can cause any API 
 incompatibilities, but as it is in the very core I'd like to ask your opinion 
 - any cons for adding it into next release?
 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Shipping logical address String in Hot Rod protocol v1.1?

2011-11-29 Thread Dan Berindei
On Tue, Nov 29, 2011 at 5:09 PM, Bela Ban b...@redhat.com wrote:


 On 11/29/11 3:50 PM, Galder Zamarreño wrote:

 On Nov 29, 2011, at 2:23 PM, Bela Ban wrote:



 On 11/29/11 2:10 PM, Galder Zamarreño wrote:
 Hi,

 We've been having a discussion this morning with regards to the Hot Rod 
 changes introduced in 5.1 with regards to hashing.

 When the Hot Rod server is deployed in AS, in order to start correctly, it 
 needs to start before any other (clustered) caches in AS. This is because 
 hashing can only be done once the hash has been calculated for the Hot Rod 
 endpoint.

 Although this can be fixed in a hacky way (have all caches configured to 
 start lazily and let the Hot Rod server start the topology cache and then 
 all defined caches, ugh), we're considering a slight change to Hot Rod 
 protocol v 1.1 
 (https://docs.jboss.org/author/display/ISPN/Hot+Rod+Protocol+-+Version+1.1)
  that would solve this problem.

 Instead of hashing on the hot rod endpoint address (host:port), we could 
 hash on JGroups's Address UTF-8 toString representation, which is the 
 logical address. The advantage here is any cache can calculate the hash on 
 this from the start, no need to wait for Hot Rod server to start. The 
 downside is that Hot Rod clients need to be aware of this UTF-8 string in 
 order to hash to the same thing, so it'd mean shipping this back to the 
 clients alongside the hot rod endpoint info.


 I have a few concerns, maybe not an issue with the way HotRod uses this,
 but I wanted to bring this up anyway:

 These are very valid concerns.

 - What if a logical name is null ? A user can do JChannel.setName(null);

 Hmmm, what's the JGroups address toString() representation in that case?


 The underlying UUID would be printed (not a nice sight !)...


Hmm, I sometimes see UUIDs in the test suite logs. This makes me
think, can we rely on the discovery protocol always giving us the
logical names of the other cluster members during/after the
viewAccepted() callback?



 - What if the logical name changes at runtime ? Certainly not
 recomended, but the APi allows this, or I should rather say, doesn't
 prevent it... :-(

 If the logical name changes, is a view change triggered?


 No. Maybe I should disallow changing a logical name after connect() has
 been called...


 How are other nodes gonna locate this node if its logical address has 
 changed?


 Through the discovery process. Again, maybe I should just disallow this
 use case, as it IMO doesn't make any sense.


 Looks dangerous from a JGroups perspective, why is it allowed?


 I like to give users a lot of rope to hang themselves ... :-) ha ha

 OK, you convinced me: https://issues.jboss.org/browse/JGRP-1395, for
 3.0.2 and 3.1.0


+1


 - What if we have multiple members with the same logical name ?

 Can that happen?


 Yes


It's not very likely with standalone Infinispan, as we also add a
random number to the configured node name. We also have the option of
enforcing a unique node name by just failing to start when we detect
duplicates.

Don't know how AS sets the node names, but I'm pretty sure they also
ensure a unique node name.


 How can JGroups work correctly if multiple members have the same logical 
 name? How can unicasts be directed correctly in this case?


 Because the logical name is just a nice way to hide a UUID. JGroups
 never uses the logical address for anything but pretty-printing; instead
 the underlying UUID is used for addressing purposes.


 - At startup, we may not yet have the logical names of all members, as
 this is done as part of the discovery protocol…

 Hmmm, don't think that's important. What matters is that when node A starts, 
 after the channel is open, its logical address is available locally.


 OK , that'll be the case


 That's the same right now. When Hot Rod starts, it maps the running node's 
 JGroups address to its Hot Rod endpoint address.


 OK


 - Can't you hash on the address ? It'll be available as soon as
 connect() returns. Is that too late ?

 That would mean hashing a Java object, which is not portable; hence we need 
 to hash something that's portable, i.e. a UTF-8 String.


 Can't you grab the UUID which is 2 longs and simply send 2 longs across
 the wire ? The UUID is guaranteed to be unique.


If we commit to sending a UUID to the HotRod clients it will be harder
to send something else in the future. One of the reasons we chose to
send a string was so that we could change the way we generate that
string without breaking the clients.

For example I'm very interested in starting nodes with a fixed hash
seed, so that after a restart the local cache store is still relevant
for this node. It would be also nice if the order of nodes on the hash
wheel wouldn't change between test runs.

As a compromise we could send a 32-bit hash to the clients and specify
the algorithm to generate virtual node hashes from it in the HotRod
protocol. That way we won't require 

Re: [infinispan-dev] Shipping logical address String in Hot Rod protocol v1.1?

2011-11-30 Thread Dan Berindei
On Wed, Nov 30, 2011 at 10:04 AM, Bela Ban b...@redhat.com wrote:


 On 11/30/11 8:59 AM, Dan Berindei wrote:

 Why don't you send the UUID as a (16 byte) string then ?


 Yeah, that would work. However, a UUID is not always a valid UTF-8
 string, so we should probably define it in the protocol as an array of
 bytes (without any meaning).


 Yes. We did something similar in JGroups, take a look at
 ENCRYPT.byteArrayToHexString().


Since HotRod is a binary protocol I meant to send the UUID as it is
(no encoding whatsoever).

I think sending the UUID as a hex-encoded string, a raw byte array, or
a 32-bit hash (since that's all we need on the client) are all valid
choices.
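
For concreteness, a sketch of the three wire options using java.util.UUID as
a stand-in for the JGroups UUID (which also boils down to two longs); none of
this is actual protocol code:

    import java.nio.ByteBuffer;
    import java.util.UUID;

    public class UuidEncoding {
       // Option 1: the raw 16 bytes, no encoding whatsoever.
       static byte[] toBytes(UUID uuid) {
          return ByteBuffer.allocate(16)
                .putLong(uuid.getMostSignificantBits())
                .putLong(uuid.getLeastSignificantBits())
                .array();
       }

       // Option 2: a hex-encoded string, always valid UTF-8 (32 chars).
       static String toHexString(UUID uuid) {
          StringBuilder sb = new StringBuilder(32);
          for (byte b : toBytes(uuid)) {
             sb.append(String.format("%02x", b));
          }
          return sb.toString();
       }

       // Option 3: a 32-bit hash derived from the UUID, which is all a client
       // needs to place the node on the hash wheel.
       static int toHash(UUID uuid) {
          long hi = uuid.getMostSignificantBits(), lo = uuid.getLeastSignificantBits();
          return (int) (hi ^ (hi >>> 32) ^ lo ^ (lo >>> 32));
       }
    }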

 --
 Bela Ban
 Lead JGroups (http://www.jgroups.org)
 JBoss / Red Hat
 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev
___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Suspect Exceptions

2011-12-12 Thread Dan Berindei
Sanne, I think I changed this with my fix for ISPN-1493 and now the
StateTransferLockInterceptor retries the operation (at the moment, for
up to 3 times).

Dan


On Fri, Dec 9, 2011 at 4:56 PM, Slorg1 slo...@gmail.com wrote:
 Hi,

 On Fri, Dec 9, 2011 at 09:53, Sanne Grinovero sa...@infinispan.org wrote:
 Thanks Vladimir,
 I can do that as a short term solution, but do you think we could have
 Infinispan care for it in the future?
 I don't like to mandate wrapping all invocations to Infinispan in a
 try catch block looking for such errors, and maybe others.

 I am using 5.1.0 BETA 5 and when members are suspected I do not get
 the exception.

 Infinispan takes care of it for me.

 Regards,

 Slorg1

 --
 Please consider the environment - do you really need to print this email ?
 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev
___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] boolean properties in config

2011-12-13 Thread Dan Berindei
On Tue, Dec 13, 2011 at 10:44 AM, Galder Zamarreño gal...@redhat.com wrote:

 On Dec 12, 2011, at 3:45 AM, Mircea Markus wrote:


 On 10 Dec 2011, at 10:04, Pete Muir wrote:

 https://issues.jboss.org/browse/ISPN-1474

 Currently we do a mix of:

 enableXXX()
 disableXXX()
 enabled(boolean b)
 XXX(boolean b)

 where XXX is something to enable or disable (e.g. purgeOnStartup). We also 
 scatter the word use around in a very inconsistent fashion.
 this is a bit ugly indeed

 +1


 I would like to rationalise this, and would propose that every boolean has:

 enableXXX
 disableXXX
 xxxEnabled(boolean b)

 The former 2 are nice for hardcoded config, the latter is nice when you are 
 adapting one format to another (e.g. Paul).
 I think this would be much more readable. My only comment is that we allow 
 two ways of doing the same thing, which might be slightly confusing for the 
 users. Also, the ease of migration from the old to the new configuration is 
 important, but IMO not as important as keeping the new config simple.

 I agree with Mircea that having two ways of doing the same thing might 
 confuse users. If I had to choose I'd probably go for the latter option.


How about:

xxxEnabled()
xxxEnabled(boolean b)

However, I don't think enabled is going to work in every situation
(loader.sharedEnabled?), so we might want to keep the
xxx()/xxx(boolean b) form in some places.
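
To make the shapes concrete, a hypothetical builder fragment (purgeOnStartup
is just an example property; none of this is the final API):

    public class StoreConfigBuilder {
       private boolean purgeOnStartup;

       // nice for hand-written config
       public StoreConfigBuilder enablePurgeOnStartup()  { purgeOnStartup = true;  return this; }
       public StoreConfigBuilder disablePurgeOnStartup() { purgeOnStartup = false; return this; }

       // nice when adapting another format, e.g. XML attributes
       public StoreConfigBuilder purgeOnStartupEnabled(boolean b) { purgeOnStartup = b; return this; }

       // reader form
       public boolean purgeOnStartupEnabled() { return purgeOnStartup; }
    }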


 I would deprecate the usage of XXX(boolean b) and useXXX(boolean b) and 
 remove in 6.

 Thoughts?
 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev


 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

 --
 Galder Zamarreño
 Sr. Software Engineer
 Infinispan, JBoss Cache


 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


[infinispan-dev] ISPN-1586 - inconsistent cache data in replication cluster with local (not shared) cache store

2011-12-14 Thread Dan Berindei
Hi guys

For a little background, see the discussion at
https://issues.jboss.org/browse/ISPN-1586

How do you feel about discarding the contents of the cache store on all
cache (virtual) nodes except the first to start?

Configuring purgeOnStartup=true is not ok in data grid scenarios, because
you'd lose all the data on restart. But loading entries from the cache
store of a node that started later is almost guaranteed to lead to stale
data. In some use cases that stale data might be acceptable, but in most
cases it probably isn't. So perhaps it makes sense to make this
configurable?

Cheers
Dan
___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev

Re: [infinispan-dev] 5.1.0.CR2 tomorrow, are you ready?

2011-12-16 Thread Dan Berindei
I've been working on some state transfer/virtual cache view
improvements: ISPN-1606. I'll have a pull request up in an hour.

Dan


On Thu, Dec 15, 2011 at 8:58 PM, Galder Zamarreño gal...@redhat.com wrote:

 On Dec 15, 2011, at 2:21 PM, Vladimir Blagojevic wrote:

 If someone can deal with my pull
 https://github.com/infinispan/infinispan/pull/722 I would appreciate it.
 It is very non-affecting to the rest of the codebase.

 Thx, just had a quick look and commented. It needs some tweaking IMO.


 Also, I have sent out a fix proposal for
 https://issues.jboss.org/browse/ISPN-1546 to Sanne for testing, I would
 like to squeeze that one in if he approves.


 On 11-12-15 9:24 AM, Pete Muir wrote:
 I'm hoping to replace the parser; I have it mostly implemented, just 
 testing.

 On 15 Dec 2011, at 12:21, Galder Zamarreño wrote:

 Hi,

 We're planning to release CR2 tomorrow. How are things looking on your 
 side?

 I have the PFER stuff to commit, assuming Mircea's happy with it (see dev 
 list thread), and https://issues.jboss.org/browse/ISPN-1609

 Cheers,
 --
 Galder Zamarreño
 Sr. Software Engineer
 Infinispan, JBoss Cache


 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

 --
 Galder Zamarreño
 Sr. Software Engineer
 Infinispan, JBoss Cache


 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] inefficient usage of CacheLoader with REPL

2011-12-21 Thread Dan Berindei
 Dan, thanks for looking. So it ignores the return values: wouldn't it
 be more efficient to ask the remote node to not serialize it at all in
 the first place?
 I mean ReplicationInterceptor should send over both
 Flag.SKIP_CACHE_LOAD (*not* X_LOAD) and Flag.SKIP_REMOTE_LOOKUP
 on any write operation.


Flag.SKIP_REMOTE_LOOKUP shouldn't be necessary, the ResponseGenerator
on the remote node checks the command ID before serializing the
response and sends a null instead if the command is not one of those
listed in DefaultResponseGenerator.requiresResponse (for replicated
and invalidation caches).

The ResponseGenerator does seem a bit magic, so perhaps it would be
better to use Flag.SKIP_REMOTE_LOOKUP instead and maybe add a generic
Flag.IGNORE_REMOTE_RESPONSE if there are other commands that would
benefit from it. Definitely something for 5.2 :)
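
For reference, this is how a caller can already opt out of the return-value
machinery today; the flags are the real ones from org.infinispan.context.Flag,
the surrounding class is just an illustration:

    import org.infinispan.Cache;
    import org.infinispan.context.Flag;

    public class FlaggedPut {
       public static void put(Cache<String, String> cache, String k, String v) {
          // don't load the previous value from the store, and don't fetch it remotely
          cache.getAdvancedCache()
               .withFlags(Flag.SKIP_CACHE_LOAD, Flag.SKIP_REMOTE_LOOKUP)
               .put(k, v);
       }
    }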

Cheers
Dan
___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Infinispan 5.1.0.CR3 release

2012-01-09 Thread Dan Berindei
I still have some state transfer failures, I was hoping to fix them by
this morning but I'm still having troubles so I might finish very late
today.

Dan


On Mon, Jan 9, 2012 at 1:39 PM, Galder Zamarreño gal...@redhat.com wrote:
 Not on my side. I still need to check what 2LC looks like with the fixes it 
 needed, which I'll do today (w/ snapshot) or tomorrow (w/ CR3)

 On Jan 9, 2012, at 12:35 PM, Manik Surtani wrote:

 Yes please.  Everyone, shall we defer our status update today for tomorrow, 
 and instead focus on getting this release out today?

 Anything you feel that is important that *won't* make the cut?

 On 9 Jan 2012, at 10:29, Galder Zamarreño wrote:

 Guys, how does CR3 today sound?

 On Jan 4, 2012, at 5:24 PM, Galder Zamarreño wrote:

 I'm on holiday on Friday (instead of Monday) so I'm hoping to finish all 
 by CR3 stuff by tomorrow.

 So, if stuff is ready, maybe someone else can release on Friday instead of 
 me?

 Cheers,

 On Jan 4, 2012, at 4:56 PM, Manik Surtani wrote:

 I'm happy with one this week even, if we're ready in time.

 On 4 Jan 2012, at 15:53, Galder Zamarreño wrote:

 Btw, I had originally planned to release CR3 on the 12th but based on 
 Tristan's request I'd like to do it earlier, probably on 9th or 10th 
 (latest).

 How does that sound?

 Let's get cracking with all those remaining issues! :)

 On Jan 4, 2012, at 4:41 PM, Tristan Tarrant wrote:

 On 01/04/2012 04:39 PM, Galder Zamarreño wrote:
 We're in CR stages though so what's the point of releasing a CR which 
 actually is not really a CR? ;)

 I can see the point of faster, more frequent releases in BETA/ALPHA 
 stages because we can push over things, but doesn't make sense doing 
 that in CR.

 Galder, our CRs are not real CRs anyway, so it's only a naming thing.

 Tristan
 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

 --
 Galder Zamarreño
 Sr. Software Engineer
 Infinispan, JBoss Cache


 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

 --
 Manik Surtani
 ma...@jboss.org
 twitter.com/maniksurtani

 Lead, Infinispan
 http://www.infinispan.org




 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

 --
 Galder Zamarreño
 Sr. Software Engineer
 Infinispan, JBoss Cache


 --
 Galder Zamarreño
 Sr. Software Engineer
 Infinispan, JBoss Cache


 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

 --
 Manik Surtani
 ma...@jboss.org
 twitter.com/maniksurtani

 Lead, Infinispan
 http://www.infinispan.org




 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

 --
 Galder Zamarreño
 Sr. Software Engineer
 Infinispan, JBoss Cache


 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] IgnoreExtraResponsesValidityFilter

2012-01-14 Thread Dan Berindei
Good catch Manik! I just copied the approach from
ClusteredGetResponseFilter and I didn't think too much about
performance issues.

Bela, the BitSet approach sounds a little better than a simple counter
from a debugging perspective. But I think we're also missing some
synchronization at the moment; the counter approach could be
made thread-safe with minimal overhead by using an AtomicInteger. I
assume using only a counter is safe (i.e. RspFilter.isAcceptable()
will never be called twice for the same target).

ClusteredGetResponseFilter could probably use the same treatment,
except the targets list is quite small so we can search in the list
directly instead of creating a HashSet with the same contents.
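
A minimal sketch of the counter variant, assuming isAcceptable() is indeed
never invoked twice for the same sender (RspFilter is the real JGroups
interface; the class itself is made up):

    import java.util.concurrent.atomic.AtomicInteger;
    import org.jgroups.Address;
    import org.jgroups.blocks.RspFilter;

    // Instead of copying the membership into a Set and removing addresses as
    // responses arrive, just count down from the expected response count.
    public class CountingResponseFilter implements RspFilter {
       private final AtomicInteger missing;

       public CountingResponseFilter(int expectedResponses) {
          this.missing = new AtomicInteger(expectedResponses);
       }

       public boolean isAcceptable(Object response, Address sender) {
          missing.decrementAndGet();
          return true; // every response counts; extra/late ones just go negative
       }

       public boolean needMoreResponses() {
          return missing.get() > 0;
       }
    }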

Cheers
Dan


On Sat, Jan 14, 2012 at 4:06 PM, Bela Ban b...@redhat.com wrote:
 Another implementation could take the current view, and create a BitSet
 with the size being the length of the view, and every bit corresponds to
 the position of a member in the view, e.g.:
 V1={A,B,C,D,E,F}, bitset={0,0,1,0,1,1} means that responses have been
 received from C, E and F.

 On 1/13/12 8:20 PM, Manik Surtani wrote:
 Looking at IgnoreExtraResponsesValidityFilter - this seems to be a 
 scalability issue!  It seems to copy a set of every address in the cluster 
 and gradually remove entries as responses come in.  Isn't this a scalability 
 issue?  Since - assuming a large cluster - for every prepare command, we 
 create a collection, copy into it the entire list of addresses (think 
 hundreds of nodes x hundreds of concurrent threads) only to count down on 
 the responses.  I'm almost certain there is a better way to do this!  :)  
 Maybe even maintain a shared list of members (updated whenever there is a 
 view change) to test for responses from non-members, a counter, and assume 
 that members don't respond to the same request more than once?

 Cheers
 Manik
 --
 Manik Surtani
 ma...@jboss.org
 twitter.com/maniksurtani

 Lead, Infinispan
 http://www.infinispan.org

 --
 Bela Ban
 Lead JGroups (http://www.jgroups.org)
 JBoss / Red Hat
 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] moving asyncMarshalling to jgroups

2012-01-19 Thread Dan Berindei
On Thu, Jan 19, 2012 at 7:46 PM, Mircea Markus mircea.mar...@jboss.com wrote:

 On 19 Jan 2012, at 17:36, Galder Zamarreño wrote:



 On Jan 19, 2012, at 3:43 PM, Bela Ban wrote:

 This may not give you any performance increase:


 #1 In my experience, serialization is way faster than de-serialization.

 Unless you're doing something fancy in your serializer


 No. I think Mircea didn't explain this very well. What really happens here
 is that when asyncMarshalling is turned on (the name is confusing...), async
 transport sending is activated. What does this mean?

 When the request needs to be passed on to JGroups, this is done in a separate
 thread, which, indirectly, results in marshalling happening in a different
 thread.

 This returns faster because we return control earlier, rather than waiting
 for JGroups to do its thing of marshalling and sending the command (without
 waiting for a response).

 See the use of RpcManagerImpl.asyncExecutor for more info.

 Thanks for the clarification Galder.
 So basically with asyncMarshalling=true (this only works for async x-casts):
 - the user thread (1) passes control to the marshalling thread (2); at this
 point the user thread returns
 - the marshalling thread (2) serialises and builds the Message object, then
 passes control to the JGroups thread (3), which sends the message async


Are you sure about this Mircea? AFAIK the message is always sent by
JGroups from our thread (either the user thread or the async transport
thread). The only difference between sync (GET_ALL) and async calls
(GET_NONE) at the JGroups level is that with GET_NONE that thread
doesn't wait for the response from the target nodes.
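
(For reference, a sketch of that JGroups-level difference; RequestOptions and
ResponseMode are the real JGroups 3.x classes, the wrapper class is just an
illustration.)

    import org.jgroups.Message;
    import org.jgroups.blocks.MessageDispatcher;
    import org.jgroups.blocks.RequestOptions;
    import org.jgroups.blocks.ResponseMode;

    public class SyncVsAsync {
       // sync: the calling thread sends and then waits for all responses
       static Object syncCall(MessageDispatcher disp, Message msg) throws Exception {
          return disp.sendMessage(msg, new RequestOptions(ResponseMode.GET_ALL, 15000));
       }

       // async: the very same send path, but nobody waits for responses
       static void asyncCall(MessageDispatcher disp, Message msg) throws Exception {
          disp.sendMessage(msg, new RequestOptions(ResponseMode.GET_NONE, 15000));
       }
    }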

Cheers
Dan

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Keeping track of locked nodes

2012-01-20 Thread Dan Berindei
In this particular case it just means that the commit is sync even if
it's configured to be async.

But there are more places where we check the remote lock nodes list,
e.g. BaseRpcInterceptor.shouldInvokeRemotely or
AbstractEnlistmentAdapter - which, incidentally, could probably make do
with a hasRemoteLocks boolean flag instead of the nodes collection.

TransactionXaAdapter.forgetSuccessfullyCompletedTransaction does use
it properly when recovery is enabled - if we didn't keep the
collection around it would have to compute it again from the list of
keys.

The same with PessimisticLockingInterceptor.releaseLocksOnFailureBeforePrepare,
but there we're on the failure path already, so recomputing the address
set wouldn't hurt that much.

I may have missed others...

Cheers
Dan


On Fri, Jan 20, 2012 at 8:52 AM, Manik Surtani ma...@jboss.org wrote:
 Well, a view change may not mean that the nodes prepared on have changed.  
 But still, there would almost certainly be a better way to do this than a 
 collection of addresses.  So the goal is to determine whether the set of 
 nodes the prepare has been sent to and the current set of affected nodes at 
 the time of commit has changed?  And what are the consequences of getting 
 this (pessimistically) wrong, just that the prepare gets repeated on some 
 nodes?


 On 20 Jan 2012, at 03:54, Mircea Markus wrote:

 On 19 Jan 2012, at 19:22, Sanne Grinovero wrote:
 I just noticed that org.infinispan.transaction.LocalTransaction is
 keeping track of Addresses on which locks were acquired.
 That surprises me .. why should it ever be interested in the
 specific Address? I'd expect it to be able to figure that out when
 needed, especially since the Address owning the lock might change over
 time, so I don't understand why we track a specific node.
 This information is required at least here [1], but I'm not sure whether we 
 can instead rely on the viewId which is now associated with every transaction 
 to replace that logic.
 [1] http://bit.ly/wgMIHH


 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

 --
 Manik Surtani
 ma...@jboss.org
 twitter.com/maniksurtani

 Lead, Infinispan
 http://www.infinispan.org




 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] DIST.retrieveFromRemoteSource

2012-01-25 Thread Dan Berindei
Hi Sanne

On Wed, Jan 25, 2012 at 1:22 AM, Sanne Grinovero sa...@infinispan.org wrote:
 Hello,
 in the method:
 org.infinispan.distribution.DistributionManagerImpl.retrieveFromRemoteSource(Object,
 InvocationContext, boolean)

 we have:

       List<Address> targets = locate(key);
      // if any of the recipients has left the cluster since the
 command was issued, just don't wait for its response
      targets.retainAll(rpcManager.getTransport().getMembers());

 But then we use ResponseMode.WAIT_FOR_VALID_RESPONSE, which means
 we're not going to wait for all responses anyway, and I think we can
 assume we'll get a reply from a node which actually is in the cluster.

 So the retainAll method is unneeded and can be removed? I'm wondering,
 because it's not safe anyway: it seems very unlikely to me
 that the view changes just between a locate(key) and the retainAll,
 so it's not something we should be relying on anyway.
 I'd rather assume that such a get method might be checked and
 eventually dropped by the receiver.


The locate method will return a list of owners based on the
committed cache view, so there is a non-zero probability that one of
the owners has already left.

If I remember correctly, I added the retainAll call because otherwise
ClusteredGetResponseFilter.needMoreResponses() would keep returning
true if one of the targets left the cluster. Coupled with the fact
that null responses are not considered valid (unless *all* responses
are null), this meant that a remote get for a key that doesn't exist
would throw a TimeoutException after 15 seconds instead of returning
null immediately.

We could revisit the decision to make null responses invalid, and then
as long as there is still one of the old owners left in the cluster
you'll get the null result immediately. You may still get an exception
if all the old owners left the cluster, but I'm not sure. I wish I had
added a test for this...

We may also be able to add a workaround in FutureCollator as well -
just remember that we use the same FutureCollator for writes in REPL
mode so it needs to work with GET_ALL as well as with GET_FIRST.

Slightly related, I wonder if Manik's comment is still true:

if at all possible, try not to use JGroups' ANYCAST for now.
Multiple (parallel) UNICASTs are much faster.)

Intuitively it shouldn't be true, unicasts+FutureCollator do basically
the same thing as anycast+GroupRequest.

Cheers
Dan


 Cheers,
 Sanne
 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] default value for virtualNodes

2012-01-27 Thread Dan Berindei
On Fri, Jan 27, 2012 at 2:35 PM, Mircea Markus mircea.mar...@jboss.com wrote:
 I've created a JIRA to track this: https://issues.jboss.org/browse/ISPN-1801

 I understand everybody has to have the exact same number of vnodes for

 reads and writes to hit the correct node, right ?

 Yes.

 That's true, but it is not a good thing: numVirtNodes should be proportional
 to the node's capacity, i.e. more powerful machines in the cluster should
 have more virtual nodes assigned.
 This way we can better control the load. A node would need to send its
 configured numVirtualNodes when joining in order to support this, but that's
 a thing we already do for TACH.


We should use a different mechanism than the TopologyAwareUUID we use
for TACH, because the address is sent with every command. The capacity
instead should be fairly static. We may want to make it changeable at
runtime, but it will take a state transfer to propagate that info to
all the members of the cluster (because the nodes' CHs need to stay in
sync).

In fact, I can imagine users wanting to balance key ownership between
machines/racks/sites with TACH, but without actually using RELAY - the
TopologyAwareUUID is just an overhead for them.

Cheers
Dan

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] default value for virtualNodes

2012-01-27 Thread Dan Berindei
On Fri, Jan 27, 2012 at 2:53 PM, Mircea Markus mircea.mar...@jboss.com wrote:
 That's true, but it is not a good thing: numVirtNodes should be proportional
 with the node's capacity, i.e. more powerful machines in the cluster should
 have assigned more virtual nodes.
 This way we can better control the load. A node would need to send its
 configured numVirtualNodes when joining in order to support this, but that's
 a thing we already do for  TACH.


 We should use a different mechanism than the TopologyAwareUUID we use
 for TACH, because the address is sent with every command.
 so every command sends cluster, rack and machine info? That sounds a bit 
 redundant. Can't we just send them once with the JOIN request?

When RELAY is enabled, it actually needs the topology info in order to
relay messages between sites.
I agree that the topology info should never change, but RELAY requires
it for now so we can't avoid it (in the general case).

Cheers
Dan

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] very interesting performance results

2012-01-27 Thread Dan Berindei
On Fri, Jan 27, 2012 at 3:43 PM, Mircea Markus mircea.mar...@jboss.com wrote:
 On 27 Jan 2012, at 13:31, Sanne Grinovero wrote:
 My experiments where using the default JVM settings regarding compile
 settings, with these others:

 -Xmx2G -Xms2G -XX:MaxPermSize=128M -XX:+HeapDumpOnOutOfMemoryError
 -Xss512k -XX:HeapDumpPath=/tmp/java_heap
 -Djava.net.preferIPv4Stack=true -Djgroups.bind_addr=127.0.0.1
 -Dlog4j.configuration=file:/opt/log4j.xml -XX:+PrintCompilation
 -Xbatch -server -XX:+UseCompressedOops -XX:+UseLargePages
 -XX:LargePageSizeInBytes=2m -XX:+AlwaysPreTouch

 And it's with these options that after 30 minutes of full-stress it's
 still not finished warming up all of Infinispan+JGroups code.
 After that I stated I was going to *experiment* with
 -XX:CompileThreshold=10, to see if I could get it to compile in a
 shorter time, just to save me from waiting too long in performance
 tests. It doesn't seem to matter much, so I'm reverting it back to the
 above values (the default for this parameter on my VM is 10,000).

 That's surprising, I'd say that in 30 mins of invocations all the critical 
 paths are touched many more than 10,000 times. E.g. the number of 
 reads/sec is in the thousands (20k on the cluster lab). Might it be that this 
 param is ignored, or that it collides with other -XX flags?


PrintCompilation tells you which methods are compiled, but it doesn't
tell you which methods were inlined with it. So something like this
can (and probably does) happen:

1. ConcurrentSkipListMap.doPut is compiled.
2. UNICAST2.down() (assuming JGroups 3.0.3.Final) calls
AckSenderWindow.add() -> ConcurrentSkipListMap.put() -> doPut. This gets
compiled, and everything is inlined in it.
3. The conditions under which ConcurrentSkipListMap.put is called change
dramatically - so the initial optimizations are no longer valid.
4. Eventually the timers and everything else that's using
ConcurrentSkipListMap call put() 10,000 times and
ConcurrentSkipListMap.doPut is compiled again.

AFAIK oprofile will not report any inlined methods, so if
ConcurrentSkipListMap appears in the report it means that it is still
called a fair number of times. On the other hand, oprofile also
doesn't report interpreted methods - so if it appears as a separate
method in the oprofile report then it means it is compiled.

You should also try running the test without -Xbatch, apparently it's
only good for debugging the JVM:
http://stackoverflow.com/questions/3369791/java-vm-tuning-xbatch-and-xcomp

It shouldn't change what methods get compiled, but compilation should
be less expensive without it.

Cheers
Dan

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] DIST.retrieveFromRemoteSource

2012-01-27 Thread Dan Berindei
Manik, Bela, I think we send the requests sequentially as well. In
ReplicationTask.call:

   for (Address a : targets) {
   NotifyingFuture<Object> f =
sendMessageWithFuture(constructMessage(buf, a), opts);
  futureCollator.watchFuture(f, a);
   }


In MessageDispatcher.sendMessageWithFuture:

UnicastRequest<T> req=new UnicastRequest<T>(msg, corr, dest, options);
req.setBlockForResults(false);
req.execute();


Did we use to send each request on a separate thread?
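
For comparison, a sketch of what per-recipient send threads could look like
(Send stands in for the real dispatcher call; nothing here is the actual code):

    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    // Sketch only: hand each unicast to an executor so the socket send runs
    // concurrently per recipient, instead of the sequential loop quoted above.
    public class ParallelUnicasts {
       interface Send { void sendTo(String target); }

       public static void sendParallel(List<String> targets, final Send send) {
          ExecutorService executor = Executors.newCachedThreadPool();
          for (final String target : targets) {
             executor.submit(new Runnable() {
                public void run() { send.sendTo(target); } // one send per thread
             });
          }
          executor.shutdown(); // already-submitted sends still complete
       }
    }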


Cheers
Dan


On Fri, Jan 27, 2012 at 1:21 PM, Bela Ban b...@redhat.com wrote:
 yes.

 On 1/27/12 12:13 PM, Manik Surtani wrote:

 On 25 Jan 2012, at 09:42, Bela Ban wrote:

 No, parallel unicasts will be faster, as an anycast to A,B,C sends the
 unicasts sequentially

 Is this still the case in JG 3.x?


 --
 Bela Ban
 Lead JGroups (http://www.jgroups.org)
 JBoss / Red Hat
 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev
___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] The need for a 5.1.1

2012-01-27 Thread Dan Berindei
On Fri, Jan 27, 2012 at 5:31 PM, Bela Ban b...@redhat.com wrote:


 On 1/27/12 4:26 PM, Mircea Markus wrote:

 On 27 Jan 2012, at 15:08, Bela Ban wrote:

 Build the JGroups JAR with ./build.sh jar, *not* via maven !

 I attached the JAR for you.
 Thanks!
 JGroups *is* and *will remain* JAR less ! :-)

 Sorry for losing the faith :)
 Might make sense to have the mvn install work as well though, I think people 
 would expect it to behave correctly when they see a pom.xml.

 Are you volunteering ? I'd be happy to integrate your changes ! :-)


Tests probably don't work, but it's enough to get it building:
https://github.com/belaban/JGroups/pull/25

Cheers
Dan
___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] default value for virtualNodes

2012-01-30 Thread Dan Berindei
Manik, I'm assigning ISPN-1801 to myself - I need to add my key
distribution test and the results anyway.

Cheers
Dan


On Mon, Jan 30, 2012 at 1:17 PM, Manik Surtani ma...@jboss.org wrote:

 On 29 Jan 2012, at 14:57, Galder Zamarreño wrote:

 To reiterate what I said in another thread, the memory effect of virtual
 nodes on Hot Rod clients is none since version 1.1 of the protocol
 (included in 5.1).

 I enhanced the protocol so that clients would generate virtual node hashes
 and so avoid sending them over the wire.


 Yes, I remember.  Perfect.  :)

 So for https://issues.jboss.org/browse/ISPN-1801 I'll set the default to 48.
  Any objections?

 Cheers
 Manik

 --
 Manik Surtani
 ma...@jboss.org
 twitter.com/maniksurtani

 Lead, Infinispan
 http://www.infinispan.org




 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] JBoss Libra

2012-01-31 Thread Dan Berindei
On Tue, Jan 31, 2012 at 4:49 PM, Bela Ban b...@redhat.com wrote:
 I don't know, but I don't see why not.

 If you look at java.lang.{Memory,GarbageCollector,MemoryPool}, there are
 a lot of values you can look at, including the details on eden and old.


I just looked at G1 with JConsole, it does provide memory usage
information in (java.lang:type=MemoryPool, name={G1 Young Gen, G1
Survivor, G1 Old Gen}).Usage. However, the locations are not standard,
so we'd have to keep a bunch of rules about where to look for each
garbage collector.

Plus I'm not sure what looking at each generation separately would buy
us over looking only at Runtime.freeMemory().

 You can even register for notifications when free memory drops below a
 certain threshold, but I've never tried this out and I heard some impls
 don't provide this...


The biggest problem, I think, with looking at the amount of used/free
memory is that used memory includes dead objects. So you'll get a free
memory threshold notification and start evicting entries, but you
won't know how much memory you've freed (or even whether you needed to
evict any entries in the first place) without triggering an extremely
expensive full GC.

I second Mircea's idea of using the serialized data size, that's the
only reliable data we've got.

Cheers
Dan



 On 1/31/12 3:32 PM, Tristan Tarrant wrote:
 On 01/31/2012 03:32 PM, Bela Ban wrote:
 IIRC you can look at the size of the young and old generation via JMX,
 and there you can see how much memory has accumulated.
 Does that work when using G1 ?

 Tristan
 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

 --
 Bela Ban
 Lead JGroups (http://www.jgroups.org)
 JBoss / Red Hat
 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev
___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] again: no physical address

2012-01-31 Thread Dan Berindei
Hi Bela

I guess it's pretty clear now... In Sanne's thread dump the main
thread is blocked in a cache.put() call after the cluster has
supposedly already formed:

org.infinispan.benchmark.Transactional.main() prio=10
tid=0x7ff4045de000 nid=0x7c92 in Object.wait()
[0x7ff40919d000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on 0x0007f61997d0 (a
org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher$FutureCollator)
at 
org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher$FutureCollator.getResponseList(CommandAwareRpcDispatcher.java:372)
...
at 
org.infinispan.distribution.DistributionManagerImpl.retrieveFromRemoteSource(DistributionManagerImpl.java:169)
...
at org.infinispan.CacheSupport.put(CacheSupport.java:52)
at org.infinispan.benchmark.Transactional.start(Transactional.java:110)
at org.infinispan.benchmark.Transactional.main(Transactional.java:70)

State transfer was disabled, so during the cluster startup the nodes
only had to communicate with the coordinator and not with each other. The
put command had to get the old value from another node, so it needed
the physical address and had to block until PING retrieved it.

Does PING use RSVP or does it wait for the normal STABLE timeout for
retransmission? Note that everything is blocked at this point, we
won't send another message in the entire cluster until we got the
physical address.

I'm sure you've already considered it before, but why not make the
physical addresses a part of the view installation message? This
should ensure that every node can communicate with every other node by
the time the view is installed.


I'm also not sure what to make of these lines:

 [org.jgroups.protocols.UDP] sanne-55119: no physical address for
 sanne-53650, dropping message
 [org.jgroups.protocols.pbcast.GMS] JOIN(sanne-55119) sent to
 sanne-53650 timed out (after 3000 ms), retrying

It appears that sanne-55119 knows the logical name of sanne-53650, and
the fact that it's the coordinator, but not its physical address.
Shouldn't all of this information have arrived at the same time?


Cheers
Dan


On Tue, Jan 31, 2012 at 4:31 PM, Bela Ban b...@redhat.com wrote:
 This happens every now and then, when multiple nodes join at the same
 time, on the same host and PING has a small num_initial_mbrs.

 Since 2.8, the identity of a member is not an IP address:port anymore,
 but a UUID. The UUID has to be mapped to an IP address (and port), and
 every member maintains a table of UUIDs/IP addresses. This table is
 populated at startup, but the shipping of the IP address/UUID
 association is unreliable (in the case of UDP), so packets do get
 dropped when there are traffic spikes, like concurrent startup, or when
 the high CPU usage slows down things.

 If we need to send a unicast message to P, and the table doesn't have a
 mapping for P, PING multicasts a discovery request, and drops the
 message. Every member responds with the IP address of P, which is then
 added to the table. The next time the message is sent (through
 retransmission), P's IP address will be available, and the unicast send
 should succeed.

 Of course, if the multicast or unicast response is dropped too, we'll
 run this protocol again... and again ... and again, until we finally
 have a valid IP address for P.


 On 1/31/12 11:29 AM, Manik Surtani wrote:
 I have sporadically seen this before when running some perf tests as well … 
 curious to know what's up.

 On 30 Jan 2012, at 17:45, Sanne Grinovero wrote:

 Hi Bela,
 this is the same error we where having in Boston when preparing the
 Infinispan nodes for some of the demos. So I didn't see it for a long
 time, but today it returned especially to add a special twist to my
 performance tests.

 Dan,
 when this happened it looked like I had a deadlock: the benchmark is
 not making any more progress, it looks like they are all waiting for
 answers. JConsole didn't detect a deadlock, and unfortunately I don't
 have more logs than this from either JGroups or Infinispan (since it
 was supposed to be a performance test!).

 I'm attaching a threaddump in case it interests you, but I hope not:
 this is a DIST test with 12 nodes (in the same VM from this dump). I
 didn't have time to inspect it myself as I have to run, and I think
 the interesting news here is with the no physical address

 ideas?

 [org.jboss.logging] Logging Provider: org.jboss.logging.Log4jLoggerProvider
 [org.jgroups.protocols.UDP] sanne-55119: no physical address for
 sanne-53650, dropping message
 [org.jgroups.protocols.pbcast.GMS] JOIN(sanne-55119) sent to
 sanne-53650 timed out (after 3000 ms), retrying
 [org.jgroups.protocols.pbcast.GMS] sanne-55119 already present;
 returning existing view [sanne-53650|5] [sanne-53650, sanne-49978,
 sanne-27401, sanne-4741, sanne-29196, sanne-55119]
 [org.jgroups.protocols.UDP] 

Re: [infinispan-dev] DIST.retrieveFromRemoteSource

2012-01-31 Thread Dan Berindei
It's true, but then JGroups' GroupRequest does exactly the same thing...

socket.send() takes some time too, I thought sending the request in
parallel would mean calling socket.send() on a separate thread for
each recipient.

Cheers
Dan


On Fri, Jan 27, 2012 at 6:41 PM, Manik Surtani ma...@jboss.org wrote:
 Doesn't setBlockForResults(false) mean that we're not waiting on a response, 
 and can proceed to the next message to the next recipient?

 On 27 Jan 2012, at 16:34, Dan Berindei wrote:

 Manik, Bela, I think we send the requests sequentially as well. In
 ReplicationTask.call:

               for (Address a : targets) {
                  NotifyingFutureObject f =
 sendMessageWithFuture(constructMessage(buf, a), opts);
                  futureCollator.watchFuture(f, a);
               }


 In MessageDispatcher.sendMessageWithFuture:

        UnicastRequestT req=new UnicastRequestT(msg, corr, dest, options);
        req.setBlockForResults(false);
        req.execute();


 Did we use to send each request on a separate thread?


 Cheers
 Dan


 On Fri, Jan 27, 2012 at 1:21 PM, Bela Ban b...@redhat.com wrote:
 yes.

 On 1/27/12 12:13 PM, Manik Surtani wrote:

 On 25 Jan 2012, at 09:42, Bela Ban wrote:

 No, parallel unicasts will be faster, as an anycast to A,B,C sends the
 unicasts sequentially

 Is this still the case in JG 3.x?


 --
 Bela Ban
 Lead JGroups (http://www.jgroups.org)
 JBoss / Red Hat
 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev
 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

 --
 Manik Surtani
 ma...@jboss.org
 twitter.com/maniksurtani

 Lead, Infinispan
 http://www.infinispan.org




 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] again: no physical address

2012-02-01 Thread Dan Berindei
On Wed, Feb 1, 2012 at 9:48 AM, Bela Ban b...@redhat.com wrote:


 On 1/31/12 10:55 PM, Dan Berindei wrote:
 Hi Bela

 I guess it's pretty clear now... In Sanne's thread dump the main
 thread is blocked in a cache.put() call after the cluster has
 supposedly already formed:

 org.infinispan.benchmark.Transactional.main() prio=10
 tid=0x7ff4045de000 nid=0x7c92 in Object.wait()
 [0x7ff40919d000]
     java.lang.Thread.State: TIMED_WAITING (on object monitor)
          at java.lang.Object.wait(Native Method)
          - waiting on 0x0007f61997d0 (a
 org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher$FutureCollator)
          at 
 org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher$FutureCollator.getResponseList(CommandAwareRpcDispatcher.java:372)
          ...
          at 
 org.infinispan.distribution.DistributionManagerImpl.retrieveFromRemoteSource(DistributionManagerImpl.java:169)
          ...
          at org.infinispan.CacheSupport.put(CacheSupport.java:52)
          at 
 org.infinispan.benchmark.Transactional.start(Transactional.java:110)
          at 
 org.infinispan.benchmark.Transactional.main(Transactional.java:70)

 State transfer was disabled, so during the cluster startup the nodes
 only had to communicate with the coordinator and not between them. The
 put command had to get the old value from another node, so it needed
 the physical address and had to block until PING would retrieve it.


 That's not the way it works; at startup of F, it sends its IP address
 with the discovery request. Everybody returns its IP address with the
 discovery response, so even though we have F only talking to A (the
 coordinator) initially, F will also know the IP addresses of A,B,C,D and E.


Ok, I stand corrected... since we start all the nodes on the same
thread, each of them should reply to the discovery request of the next
nodes.

However, num_initial_members was set to 3 (the Infinispan default).
Could that make PING not wait for all the responses? If it's like
that, then I suggest we set a (much) higher num_initial_members and a
lower timeout in the default configuration.



 Does PING use RSVP


 No: (1) I don;'t want a dependency of Discovery on RSVP and (2) the
 discovery is unreliable; discovery requests or responses can get dropped.


Right, I keep forgetting that every protocol is optional!



 or does it wait for the normal STABLE timeout for retransmission?


  Note that everything is blocked at this point, we
 won't send another message in the entire cluster until we got the physical 
 address.


 As I said; this is an exceptional case, probably caused by Sanne
 starting 12 channels inside the same JVM, at the same time, therefore
 causing a traffic spike, which results in dropped discovery requests or
 responses.


Bela, we create the caches on a single thread, so we never have more
than one node joining at the same time.
At most we could have some extra activity if one node can't join the
existing cluster and starts a separate partition, but hardly enough to
cause congestion.


 After than, when F wants to talk to C, it asks the cluster for C's IP
 address, and that should be a few ms at most.


Ok, so when F wanted to send the ClusteredGetCommand request to C,
PING got the physical address right away. But the ClusteredGetCommand
had to wait for STABLE to kick in and for C to ask for retransmission
(because we didn't send any other messages).

Maybe *we* should use RSVP for our ClusteredGetCommands, since those
can never block... Actually, we don't want to retransmit the request
if we already got a response from another node, so it would be best if
we could ask for retransmission of a particular request explicitly ;-)

I wonder if we could also decrease desired_avg_gossip and
stability_delay in STABLE. After all, an extra STABLE round can't slow
us when we're not doing anything, and when we are busy we're going to
hit the max_bytes limit much sooner than the desired_avg_gossip time
limit anyway.



 I'm sure you've already considered it before, but why not make the
 physical addresses a part of the view installation message? This
 should ensure that every node can communicate with every other node by
 the time the view is installed.


 There's a few reasons:

 - I don't want to make GMS dependent on logical addresses. GMS is
 completely independent and shouldn't know about physical addresses
 - At the time GMS kicks in, it's already too late. Remember, F needs to
 send a unicast JOIN request to A, but at this point it doesn't yet know
 A's address
 - MERGE{2,3} also use discovery to detect sub-partitions to be merged,
 so discovery needs to be a separate piece of functionality
 - A View is already big as it is, and I've managed to reduce its size
 even more, but adding physical addresses would blow up the size of View
 even more, especially in large clusters


Thanks for the explanation.


 I'm also not sure what to make of these lines

Re: [infinispan-dev] Proposal: ISPN-1394 Manual rehashing in 5.2

2012-02-01 Thread Dan Berindei
Bela, you're right, this is essentially what we talked about in Lisbon:
https://community.jboss.org/wiki/AsymmetricCachesAndManualRehashingDesign

For joins I actually started working on a policy of coalescing joins
that happen one after the other in a short time interval. The current
implementation is very primitive, as I shifted focus to stability, but
it does coalesce joins 1 second after another join started (or while
that join is still running).

I don't quite agree with Sanne's assessment that it's fine for
getCache() to block for 5 minutes until the administrator allows the
new node to join. We should modify startCaches() instead to signal to
the coordinator that we are ready to receive data for one or all of
the defined caches, and wait with a customizable time limit until the
caches have properly joined the cluster.

The getCache() timeout should not be increased at all. Instead I would
propose that getCache() returns a functional cache immediately, even
if the cache didn't receive any data, and it works solely as an L1
cache until the administrator allows it to join. I'd even make it
possible to designate a cache as an L1-only cache, so it's never an
owner for any key.


For leaves, the main problem is that every node has to compute the
same primary owner for a key, at all times. So we need a 2PC cache
view installation immediately after any leave to ensure that every
node determines the primary owner in the same way - we can't coalesce
or postpone leaves.

For 5.2 I will try to decouple the cache view installation from the
state transfer, so in theory we will be able to coalesce/postpone the
state transfer for leaves as well
(https://issues.jboss.org/browse/ISPN-1827). I kind of need it for
non-blocking state transfer, because with the current implementation a
leave forces us to cancel any state transfer in progress and restart
with the updated cache view - a state transfer rollback will be very
expensive with NBST.


Erik does raise a valid point - with TACH, if we bring up a node with
a different siteId, then it will be an owner for all the keys in the
cache. That node probably isn't provisioned to hold all the keys, so
it would very likely run out of memory or evict much of the data. I
guess that makes it a 5.2 issue?

Shutting down a site should be possible even with what we have now -
just insert a DISCARD protocol in the JGroups stack of all the nodes
that are shutting down, and when FD finally times out on the nodes in
the surviving datacenter they won't have any state transfer to do
(although it may cause a few failed state transfer attempts). We could
make it simpler though.
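
Programmatically it's only a few lines against the JGroups API; a sketch,
assuming the stock protocol names and a UDP transport:

   import org.jgroups.JChannel;
   import org.jgroups.protocols.DISCARD;
   import org.jgroups.stack.ProtocolStack;

   // Sketch: make a leaving node drop all traffic, so the surviving site
   // simply suspects it via FD instead of starting a graceful-leave rehash.
   public static void isolate(JChannel channel) throws Exception {
      DISCARD discard = new DISCARD();
      discard.setDiscardAll(true);     // drop everything, inbound and outbound
      ProtocolStack stack = channel.getProtocolStack();
      stack.insertProtocol(discard, ProtocolStack.ABOVE, "UDP");  // just above the transport
   }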


Cheers
Dan


On Tue, Jan 31, 2012 at 6:21 PM, Erik Salter an1...@hotmail.com wrote:
 ...such as bringing up a backup data center.

 -Original Message-
 From: infinispan-dev-boun...@lists.jboss.org
 [mailto:infinispan-dev-boun...@lists.jboss.org] On Behalf Of Bela Ban
 Sent: Tuesday, January 31, 2012 11:18 AM
 To: infinispan-dev@lists.jboss.org
 Subject: Re: [infinispan-dev] Proposal: ISPN-1394 Manual rehashing in 5.2

 I cannot volunteer either, but I find it important to be done in 5.2 !

 Unless rehashing works flawlessly with a large number of nodes joining
 at the same time, I think manual rehashing is crucial...



 On 1/31/12 5:13 PM, Sanne Grinovero wrote:
 On 31 January 2012 16:06, Bela Ban b...@redhat.com wrote:
 This is essentially what I suggested at the Lisbon meeting, right ?

 Yes!

 I think Dan had a design wiki on this somewhere...

 Just raising it here as it was moved to 6.0, while I think it deserves
 a dedicated thread to better think about it. If it's not hard, I think
 it should be done sooner.
 But while I started the thread to wake up the brilliant minds, I can't
 volunteer for this to make it happen.

 Sanne



 On 1/31/12 4:53 PM, Sanne Grinovero wrote:
 I think this is an important feature to have soon;

 My understanding of it:

 We default with the feature off, and newly discovered nodes are
 added/removed as usual. With a JMX-operable switch, one can disable
 this:

 If a remote node is joining the JGroups view, but rehash is off: it
 will be added to a to-be-installed view, but this won't be installed
 until rehash is enabled again. This gives time to add more changes
 before starting the rehash, and would help a lot to start larger
 clusters.
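
 Roughly, the switch I have in mind would look like this, using the JMX
 annotations we already have (the component name and wiring are invented,
 it's only a sketch):

    import org.infinispan.jmx.annotations.MBean;
    import org.infinispan.jmx.annotations.ManagedOperation;

    // Hypothetical JMX switch for manual rehashing -- not an existing
    // component. While disabled, joiners are parked in a pending view.
    @MBean(objectName = "RehashControl")
    public class RehashControl {
       private volatile boolean rehashEnabled = true;

       @ManagedOperation(description = "Queue joins/leaves without rehashing")
       public void disableRehash() { rehashEnabled = false; }

       @ManagedOperation(description = "Install the pending view, rehash once")
       public void enableRehash() { rehashEnabled = true; }
    }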

 If the [self] node is booting and joining a cluster with manual rehash
 off, the start process and any getCache() invocation should block and
 wait for it to be enabled. This would of course need to override the
 usually low timeouts.

 When a node is suspected it's a bit of a different story, as we need to
 make sure no data is lost. The principle is the same, but maybe we
 should have two flags: one which is a soft request to avoid rehashes
 of less than N members (and refuse N <= numOwners?), one which just
 disables it and doesn't care: data might be in a cachestore, data might
 not be important. Which reminds me, we should consider 

Re: [infinispan-dev] Proposal: ISPN-1394 Manual rehashing in 5.2

2012-02-01 Thread Dan Berindei
On Wed, Feb 1, 2012 at 1:46 PM, Sanne Grinovero sa...@infinispan.org wrote:
 On 1 February 2012 11:23, Dan Berindei dan.berin...@gmail.com wrote:
 Bela, you're right, this is essentially what we talked about in Lisbon:
 https://community.jboss.org/wiki/AsymmetricCachesAndManualRehashingDesign

 For joins I actually started working on a policy of coalescing joins
 that happen one after the other in a short time interval. The current
 implementation is very primitive, as I shifted focus to stability, but
 it does coalesce joins 1 second after another join started (or while
 that join is still running).

 I don't quite agree with Sanne's assessment that it's fine for
 getCache() to block for 5 minutes until the administrator allows the
 new node to join. We should modify startCaches() instead to signal to
 the coordinator that we are ready to receive data for one or all of
 the defined caches, and wait with a customizable time limit until the
 caches have properly joined the cluster.

 The getCache() timeout should not be increased at all. Instead I would
 propose that getCache() returns a functional cache immediately, even
 if the cache didn't receive any data, and it works solely as an L1
 cache until the administrator allows it to join. I'd even make it
 possible to designate a cache as an L1-only cache, so it's never an
 owner for any key.

 I agree that would be very nice, but makes it much more complex to
 implement in 5.2 as well: functional L1 means that the other nodes
 must accept this node as part of the grid, including for L1
 invalidation purposes.

I don't think L1 would be a problem, the L1 code doesn't assume that
the requestor is in the CH. That would basically be the only
difference between a normal node and a L1-only node.

However, I'm sure there will be problems in other areas, so I wouldn't
push this for 5.2.

 So my proposal on blocking until ready is meant as a first step, and I
 think it would still be very useful for people wanting to boot some
 ~100 nodes. Blocking the application is not a big deal, as you're
 delaying boot of an application which was likely not even powered on
 before.
 When adding several new nodes, I just want them all added at once,
 preventing intermediate rehashing: until all have joined you should
 block rehash - that's a manual (or, more likely, externally automated)
 step and will not be engaged for long, nor would it replace normal
 behaviour when disabled.

 Actually even more useful would be to start a node with such a
 no-rehash option enabled.


The rehash or no-rehash decision has to be made on the coordinator,
since he's the one that's going to trigger the rehash.
So any flag that allows or disallows rehashing has to be propagated to
the coordinator, and maybe even to the entire cluster in case the
coordinator goes down.



 For leaves, the main problem is that every node has to compute the
 same primary owner for a key, at all times. So we need a 2PC cache
 view installation immediately after any leave to ensure that every
 node determines the primary owner in the same way - we can't coalesce
 or postpone leaves.

 For 5.2 I will try to decouple the cache view installation from the
 state transfer, so in theory we will be able to coalesce/postpone the
 state transfer for leaves as well
 (https://issues.jboss.org/browse/ISPN-1827). I'm kind of need it for
 non-blocking state transfer, because with the current implementation a
 leave forces us to cancel any state transfer in progress and restart
 with the updated cache view - a state transfer rollback will be very
 expensive with NBST.


 Erik does raise a valid point - with TACH, if we bring up a node with
 a different siteId, then it will be an owner for all the keys in the
 cache. That node probably isn't provisioned to hold all the keys, so
 it would very likely run out of memory or evict much of the data. I
 guess that makes it a 5.2 issue?

 That's exactly my reasoning as well. There are situations in which you
 want to add several nodes at once, which is very different from adding
 them in rapid sequence, as that would storm the network with data
 shuffling in all directions.


I don't think the network traffic would be that bad, if the first
rehash takes a long time then all other joiners will queue up and join
at the same time. So we're looking at just one extra rehash, but the
problem is with that first rehash.


 Shutting down a site should be possible even with what we have now -
 just insert a DISCARD protocol in the JGroups stack of all the nodes
 that are shutting down, and when FD finally times out on the nodes in
 the surviving datacenter they won't have any state transfer to do
 (although it may cause a few failed state transfer attempts). We could
 make it simpler though.

 Yes, use a pair of scissors ;-)

 --Sanne



 Cheers
 Dan


 On Tue, Jan 31, 2012 at 6:21 PM, Erik Salter an1...@hotmail.com wrote:
 ...such as bringing up a backup data center.

 -Original Message-
 From: infinispan-dev

Re: [infinispan-dev] again: no physical address

2012-02-01 Thread Dan Berindei
Hi Bela

I think I found why you weren't seeing the warnings. The
bench-log4j.xml in github master is configured to log only to the log
file (benchmark.log). If you add an <appender-ref ref="CONSOLE"/>
you'll see the warnings on the console as well.

I am now able to reproduce it pretty reliably, even from IntelliJ. I
created a run configuration for the Transactional class, I set the
module to infinispan-5.1-SNAPSHOT, and I added these JVM arguments:

-Dbench.loops=3000 -Dbench.writerThreads=33 -Dbench.readerThreads=10
-Dbench.dist=true -Dbench.transactional=true -Dbench.numkeys=500
-Dbench.payloadsize=10240 -Dbench.nodes=12 -Djava.net.preferIPv4Stack=true
-Dlog4j.configuration=bench-log4j.xml


In Sanne's thread dump it was a ClusteredGetCommand that was waiting
for a physical address, but in my test runs it's usually a JOIN that's
failing. And it seems that it happens because we got a discovery
response from 3 nodes (that is our initial num_initial_members), but
none of them was the coordinator:

2012-02-02 08:04:41,672 TRACE [org.jgroups.protocols.PING] (main)
discovery took 8 ms: responses: 3 total (2 servers (0 coord), 1
clients)
2012-02-02 08:04:41,672 TRACE [org.jgroups.protocols.pbcast.GMS]
(main) localhost-47135: initial_mbrs are localhost-36539
localhost-30921
2012-02-02 08:04:41,672 DEBUG [org.jgroups.protocols.pbcast.GMS]
(main) election results: {localhost-32436=2}
2012-02-02 08:04:41,672 DEBUG [org.jgroups.protocols.pbcast.GMS]
(main) sending JOIN(localhost-47135) to localhost-32436
2012-02-02 08:04:41,673 TRACE [org.jgroups.protocols.UNICAST2] (main)
localhost-47135: created connection to localhost-32436 (conn_id=0)
2012-02-02 08:04:41,673 TRACE [org.jgroups.protocols.UNICAST2] (main)
localhost-47135 --> DATA(localhost-32436: #1, conn_id=0, first)
2012-02-02 08:04:41,673 TRACE [org.jgroups.protocols.UDP] (main)
sending msg to localhost-32436, src=localhost-47135, headers are GMS:
GmsHeader[JOIN_REQ]: mbr=localhost-47135, UNICAST2: DATA, seqno=1,
first, UDP: [channel_name=ISPN]
2012-02-02 08:04:41,673 WARN  [org.jgroups.protocols.UDP] (main)
localhost-47135: no physical address for localhost-32436, dropping
message
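
For context, the discovery phase is bounded by PING's timeout and
num_initial_members attributes; a hypothetical tweak (values invented) to
make a joiner collect more responses, so a reply from the coordinator is
more likely, would be:

   <!-- Hypothetical tuning: collect more discovery responses before the
        coordinator election runs -->
   <PING timeout="3000" num_initial_members="12"/>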


Cheers
Dan


On Wed, Feb 1, 2012 at 7:12 PM, Sanne Grinovero sa...@infinispan.org wrote:
 On 1 February 2012 16:40, Bela Ban b...@redhat.com wrote:
 Your benchmark is giving me the creeps !

 Manik was the original author, I've only been adapting it slightly to
 identify performance issues. I wouldn't have used Maven either, but
 it's serving me well especially since it turns out I have to
 frequently change JGroups version ;-)

 First, which version of JGroups / Infinispan does this use ? Second, is
 there a way to start this in an IDE rather than through maven ? Third, I
 don't think bench-jgroups.xml is picked up at all ! How do I make a change
 to bench-jgroups.xml and have Transactional *use* it ?

 1) The intention is to compare Infinispan versions, and see how we
 make progress, so the JGroups version is whatever each Infinispan
 version was released with, or, in the case of a SNAPSHOT, the version
 it's currently using.

 So, assuming you have a fresh snapshot of Infinispan 5.1.0-SNAPSHOT,
 it's JGroups 3.0.4.Final

 2) As I pointed out in the previous mail, I've reconfigured it for you
 to not pick up the bench-jgroups.xml but go with Infinispan's default
 UDP configuration.
 You have to comment line 15 of bench.sh to use the bench-jgroups.xml,
 but then I can't reproduce the issue anymore.

 This maven crap hides so much, I never know what's going on under the covers
 !@$@ Do I have to do mvn install or mvn package when I make a change to
 bench-jgroups.xml ?

 Aren't you using bench.sh, as I pointed out in my previous mail? It
 does mvn install before running it.
 But it's also configured to NOT use bench-jgroups.xml, but rather
 jgroups-udp.xml.

 I don't think I'll ever switch to this f*cking piece of shit !

 I don't think Maven is to be blamed today! What's wrong?

 Anyway one of the nice things of this little benchmark is exactly that
 it's a single class with a main method, so you can just import it in
 your IDE and run it. Any IDE will pick up the correct dependencies,
 thanks to Maven. Just that if you do, it will use the default test
 properties hardcoded in the test class
 org.infinispan.benchmark.Transactional: please set the same environment
 variables as bench.sh does, unless you don't want to run the same
 configuration as mine.

 Just ping me for any doubt.

 Cheers,
 Sanne


 Please ping me as soon as you've calmed down... :-)
 Cheers,




 On 2/1/12 4:52 PM, Sanne Grinovero wrote:

 On 1 February 2012 15:18, Bela Ban b...@redhat.com wrote:



 On 2/1/12 10:25 AM, Dan Berindei wrote:

 That's not the way it works; at startup of F, it sends its IP address
 with the discovery request. Everybody returns its IP address with the
 discovery response, so even though we have F only talking to A (the
 coordinator) initially, F will also know the IP addresses

Re: [infinispan-dev] DIST.retrieveFromRemoteSource

2012-02-05 Thread Dan Berindei
On Sat, Feb 4, 2012 at 5:23 PM, Bela Ban b...@redhat.com wrote:
 No, Socket.send() is not a hotspot in the Java code.

 On the networking level, with UDP datagrams, a packet is placed into a
 buffer and send() returns after the packet has been placed into the
 buffer successfully. Some network stacks simply discard the packet if
 the buffer is full, I assume.

 With TCP, send() blocks until the packet has been placed into the buffer
 and an ack has been received. If the receiver can't keep up with the
 sending rate, it'll reduce the window to 0, so send() blocks until the
 receiver catches up.

 So if you send 2 requests in parallel over TCP, both threads would
 block on the send().

 So I don't think parallelization would make sense here, unless of course
 you're doing something else, such as serialization, lock acquisition etc...


You're probably right...

When I sent the email I had only seen that DatagramSocket.send takes
0.4% of the get worker's total time (which is huge considering that
99.3% is spent waiting for the remote node's response). I didn't
notice that it's synchronized and the actual sending only takes half
of that - sending the messages in parallel would create more lock
contention on the socket *and* in JGroups.

On the other hand, we're sending the messages to two different nodes,
on two different sockets, so sending in parallel *may* improve the
response time in scenarios with few worker threads. Certainly not
worth making our heavy-load performance worse, but I thought it was
worth mentioning.


Anyway, my primary motivation for this question was that I believe we
could use GroupRequest instead of our FutureCollator to send our
commands in parallel. They both do essentially the same thing, except
FutureCollator has an extra indirection layer because it uses
UnicastRequest. If there is any performance discrepancy at the moment
between GroupRequest and FutureCollator+UnicastRequest, I don't think
there is any fundamental reason why we can't fix GroupRequest to be
just as efficient as FutureCollator+UnicastRequest.

I think I just saw one such fix: in RequestCorrelator.sendRequest, if
anycasting is enabled then it's making a copy of the buffer for each
target member. I don't think that is necessary at all, in fact I think
it should reuse both the buffer and the headers and only change the
destination address.
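
To make the comparison concrete, the GroupRequest path is what
MessageDispatcher.castMessage() drives; a sketch of an anycast get through
it (JGroups 3.x API, timeout value invented):

   import java.util.Arrays;
   import java.util.List;
   import org.jgroups.Address;
   import org.jgroups.Message;
   import org.jgroups.blocks.MessageDispatcher;
   import org.jgroups.blocks.RequestOptions;
   import org.jgroups.blocks.ResponseMode;
   import org.jgroups.util.RspList;

   // Sketch: one GroupRequest covering both owners, instead of two
   // UnicastRequests collated by our FutureCollator.
   public static RspList<Object> anycastGet(MessageDispatcher disp, Message msg,
                                            Address a, Address b) throws Exception {
      List<Address> targets = Arrays.asList(a, b);
      RequestOptions opts = new RequestOptions(ResponseMode.GET_FIRST, 10000)
            .setAnycasting(true);   // sent as N unicasts, not a multicast
      return disp.castMessage(targets, msg, opts);
   }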

Cheers
Dan



 On 2/4/12 3:43 PM, Manik Surtani wrote:
 Is that a micro-optimisation?  Do we know that socket.send() really is a 
 hotspot?

 On 1 Feb 2012, at 00:11, Dan Berindei wrote:

 It's true, but then JGroups' GroupRequest does exactly the same thing...

 socket.send() takes some time too, I thought sending the request in
 parallel would mean calling socket.send() on a separate thread for
 each recipient.

 Cheers
 Dan


 On Fri, Jan 27, 2012 at 6:41 PM, Manik Surtani ma...@jboss.org wrote:
 Doesn't setBlockForResults(false) mean that we're not waiting on a 
 response, and can proceed to the next message to the next recipient?

 On 27 Jan 2012, at 16:34, Dan Berindei wrote:

 Manik, Bela, I think we send the requests sequentially as well. In
 ReplicationTask.call:

                for (Address a : targets) {
                   NotifyingFuture<Object> f =
                      sendMessageWithFuture(constructMessage(buf, a), opts);
                   futureCollator.watchFuture(f, a);
                }


 In MessageDispatcher.sendMessageWithFuture:

         UnicastRequest<T> req=new UnicastRequest<T>(msg, corr, dest, options);
         req.setBlockForResults(false);
         req.execute();


 Did we use to send each request on a separate thread?


 Cheers
 Dan


 On Fri, Jan 27, 2012 at 1:21 PM, Bela Ban b...@redhat.com wrote:
 yes.

 On 1/27/12 12:13 PM, Manik Surtani wrote:

 On 25 Jan 2012, at 09:42, Bela Ban wrote:

 No, parallel unicasts will be faster, as an anycast to A,B,C sends the
 unicasts sequentially

 Is this still the case in JG 3.x?


 --
 Bela Ban
 Lead JGroups (http://www.jgroups.org)
 JBoss / Red Hat

 --
 Manik Surtani
 ma...@jboss.org
 twitter.com/maniksurtani

 Lead, Infinispan
 http://www.infinispan.org





 --
 Manik Surtani
 ma...@jboss.org
 twitter.com/maniksurtani

 Lead, Infinispan
 http://www.infinispan.org





Re: [infinispan-dev] Proposal: ISPN-1394 Manual rehashing in 5.2

2012-02-05 Thread Dan Berindei
On Sat, Feb 4, 2012 at 4:49 PM, Manik Surtani ma...@jboss.org wrote:

 On 1 Feb 2012, at 12:23, Dan Berindei wrote:

 Bela, you're right, this is essentially what we talked about in Lisbon:
 https://community.jboss.org/wiki/AsymmetricCachesAndManualRehashingDesign

 For joins I actually started working on a policy of coalescing joins
 that happen one after the other in a short time interval. The current
 implementation is very primitive, as I shifted focus to stability, but
 it does coalesce joins 1 second after another join started (or while
 that join is still running).

 I don't quite agree with Sanne's assessment that it's fine for
 getCache() to block for 5 minutes until the administrator allows the
 new node to join. We should modify startCaches() instead to signal to
 the coordinator that we are ready to receive data for one or all of
 the defined caches, and wait with a customizable time limit until the
 caches have properly joined the cluster.

 The getCache() timeout should not be increased at all. Instead I would
 propose that getCache() returns a functional cache immediately, even
 if the cache didn't receive any data, and it works solely as an L1
 cache until the administrator allows it to join. I'd even make it
 possible to designate a cache as an L1-only cache, so it's never an
 owner for any key.

 I presume this would be encoded in the Address?  That would make sense for a 
 node permanently designated as an L1 node.  But then how would this work for 
 a node temporarily acting as L1 only, until it has been allowed to join?  
 Change the Address instance on the fly?  A delegating Address?  :/


Nope, not in the Address. Since we now have virtual cache views, every
node has to explicitly request to join each cache. It would be quite
easy to add an L1-only flag to the join request, and nodes with that
flag would never be included in the proper cache view. They would
however get a copy of the cache view on join and on every cache view
installation so they could update their CH and send requests to the
proper owners.

Nodes acting temporarily as L1-only would send a normal join request,
but they would also receive a copy of the cache view and getCache()
would return immediately instead of waiting for the node to receive
state. When the coordinator finally installs a cache view that
includes them, they will perform the initial state transfer as they do
now.
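
A hypothetical shape for that flag, purely for illustration (the real
join request lives in the cache view installation code, and none of these
names exist today):

   // Hypothetical join request carrying an L1-only flag.
   public class CacheViewJoinRequest {
      final String cacheName;
      final boolean l1Only;   // if true, never include this node as an owner

      public CacheViewJoinRequest(String cacheName, boolean l1Only) {
         this.cacheName = cacheName;
         this.l1Only = l1Only;
      }
   }

The coordinator would skip l1Only members when computing the cache view,
but still push every installed view to them so their CH stays current.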


 For leaves, the main problem is that every node has to compute the
 same primary owner for a key, at all times. So we need a 2PC cache
 view installation immediately after any leave to ensure that every
 node determines the primary owner in the same way - we can't coalesce
 or postpone leaves.

 Yes, manual rehashing would probably just be for joins.  Controlled shutdown 
 in itself is manual, and crashes, well, need to be dealt with immediately IMO.


We could extend the policy for crashes as well, by adding a
minNumOwners setting and only triggering an automatic rehash when
there is a segment on the hash wheel with <= minNumOwners owners.

We would have a properly installed CH that guarantees that at some point
in the past each key had numOwners owners, and a filter on top of it
that removes any leavers from the result of DM.locate(),
DM.getPrimaryLocation() etc.
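
In code, that filter might look roughly like this (a sketch; "leavers" is
a stand-in for whatever the DM would track):

   import java.util.ArrayList;
   import java.util.Collection;
   import java.util.List;
   import org.infinispan.distribution.ch.ConsistentHash;
   import org.infinispan.remoting.transport.Address;

   // Sketch: locate() against the last properly installed CH, minus leavers.
   public List<Address> locate(Object key, ConsistentHash installedCH,
                               int numOwners, Collection<Address> leavers) {
      List<Address> owners =
            new ArrayList<Address>(installedCH.locate(key, numOwners));
      owners.removeAll(leavers);   // drop nodes that left since the install
      return owners;               // first remaining owner is the primary
   }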

It would probably undo our recent optimizations around locate and
getPrimaryLocation, so it's slowing down the normal case (without any
leavers) in order to make the exceptional case (organized shutdown of
part of the cluster) faster. The question is how big the cluster has
to get before the exceptional case becomes common enough that it's
worth optimizing for...



 For 5.2 I will try to decouple the cache view installation from the
 state transfer, so in theory we will be able to coalesce/postpone the
 state transfer for leaves as well
 (https://issues.jboss.org/browse/ISPN-1827). I'm kind of need it for
 non-blocking state transfer, because with the current implementation a
 leave forces us to cancel any state transfer in progress and restart
 with the updated cache view - a state transfer rollback will be very
 expensive with NBST.


 Erik does raise a valid point - with TACH, if we bring up a node with
 a different siteId, then it will be an owner for all the keys in the
 cache. That node probably isn't provisioned to hold all the keys, so
 it would very likely run out of memory or evict much of the data. I
 guess that makes it a 5.2 issue?

 Yes.

 Shutting down a site should be possible even with what we have now -
 just insert a DISCARD protocol in the JGroups stack of all the nodes
 that are shutting down, and when FD finally times out on the nodes in
 the surviving datacenter they won't have any state transfer to do
 (although it may cause a few failed state transfer attempts). We could
 make it simpler though.


 Cheers
 Dan


 On Tue, Jan 31, 2012 at 6:21 PM, Erik Salter an1...@hotmail.com wrote:
 ...such as bringing up a backup data center.

 -Original Message

Re: [infinispan-dev] DIST.retrieveFromRemoteSource

2012-02-05 Thread Dan Berindei
On Sun, Feb 5, 2012 at 12:24 PM, Bela Ban b...@redhat.com wrote:


 On 2/5/12 9:44 AM, Dan Berindei wrote:
  On Sat, Feb 4, 2012 at 5:23 PM, Bela Ban b...@redhat.com wrote:
 No, Socket.send() is not a hotspot in the Java code.

 On the networking level, with UDP datagrams, a packet is placed into a
 buffer and send() returns after the packet has been placed into the
 buffer successfully. Some network stacks simply discard the packet if
 the buffer is full, I assume.

 With TCP, send() blocks until the packet has been placed into the buffer
 and an ack has been received. If the receiver can't keep up with the
 sending rate, it'll reduce the window to 0, so send() blocks until the
 receiver catches up.

 So if you send 2 requests in parallel over TCP, both threads would
 block on the send().

 So I don't think parallelization would make sense here, unless of course
 you're doing something else, such as serialization, lock acquisition etc...


 You're probably right...

 When I sent the email I had only seen that DatagramSocket.send takes
 0.4% of the get worker's total time (which is huge considering that
 99.3% is spent waiting for the remote node's response). I didn't
 notice that it's synchronized and the actual sending only takes half
 of that - sending the messages in parallel would create more lock
 contention on the socket *and* in JGroups.

 On the other hand, we're sending the messages to two different nodes,
 on two different sockets, so sending in parallel *may* improve the
 response time in scenarios with few worker threads. Certainly not
 worth making our heavy-load performance worse, but I thought it was
 worth mentioning.


 If you think this makes a difference, why don't you make a temporary
 code change and measure its effect on performance ?


I will give it a try, it's just that writing emails is a bit easier ;-)


 Anyway, my primary motivation for this question was that I believe we
 could use GroupRequest instead of our FutureCollator to send our
 commands in parallel. They both do essentially the same thing, except
 FutureCollator has an extra indirection layer because it uses
 UnicastRequest.


 Ah, ok. Yes, I think that 1 GroupRequest of 2 might be more efficient
 than 2 UnicastRequests for sending an 'anycast' message to 2 members.
 Perhaps the marshalling is done only once (I'd have to confirm that
 though) and we're only creating 1 data structure and add it to a hashmap
 (request/response correlation)... Certainly worth giving a try...


We are only serializing once with the FutureCollator approach, and we
don't copy the buffer either.


  If there is any performance discrepancy at the moment
 between GroupRequest and FutureCollator+UnicastRequest, I don't think
 there is any fundamental reason why we can't fix GroupRequest to be
 just as efficient as FutureCollator+UnicastRequest.


 You're assuming that FutureCollator+UnicastRequest is faster than
 GroupRequest, what are you basing your assumption on ? As I outlined
 above, I'd rather assume the opposite, although I don't know FutureCollator.


I did say *if*...

 Maybe we should have a chat next week to go through the code that sends
 an anycast to 2 members, wdyt ?


Sounds good, we should also talk about how to implement staggered
get requests best.
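
For the staggered gets, a rough shape of what I have in mind (the names
are stand-ins for the existing fields and helpers in ReplicationTask, and
the 50ms delay is invented):

   import java.util.List;
   import java.util.concurrent.ScheduledExecutorService;
   import java.util.concurrent.TimeUnit;
   import org.jgroups.Address;
   import org.jgroups.util.NotifyingFuture;

   // Sketch: ask the primary owner first and only fan out to the backup
   // owner if no response has arrived within the stagger delay.
   void staggeredGet(final List<Address> owners, ScheduledExecutorService timer) {
      final NotifyingFuture<Object> first =
            sendMessageWithFuture(constructMessage(buf, owners.get(0)), opts);
      futureCollator.watchFuture(first, owners.get(0));
      timer.schedule(new Runnable() {
         public void run() {
            if (!first.isDone())   // primary is slow, try the backup too
               futureCollator.watchFuture(
                     sendMessageWithFuture(constructMessage(buf, owners.get(1)), opts),
                     owners.get(1));
         }
      }, 50, TimeUnit.MILLISECONDS);
   }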


 I think I just saw one such fix: in RequestCorrelator.sendRequest, if
 anycasting is enabled then it's making a copy of the buffer for each
 target member. I don't think that is necessary at all, in fact I think
 it should reuse both the buffer and the headers and only change the
 destination address.

 No, this *is* necessary, I don't just make copies because I think this
 is fun !! :-)

 Remember that a message might be retransmitted, so it is placed into a
 retransmit buffer. If M1 has destination A and M2 has destination B, and
 we send M1 first (to A), then change M1's destination to B, and send it,
 everything is fine. However, if we later get a retransmit request from
 B, we'd resend the message to A instead ! This is just 1 example,
 modifications of headers is another one.

 Note that the copy does *not* copy the buffer (payload) itself, but only
 references it, so this is fast. Of course, nobody is supposed to modify
 the contents of the buffer itself...


I wasn't clear enough, but I didn't mean we should reuse the entire
Message object. I meant we should copy the Message but not the buffer
or the headers. I see now that protocols may be adding new headers, so
it wouldn't be safe to reuse the headers collection.

I think this line in
RequestCorrelator.sendRequest(RequestCorrelator.java:152) means that
the contents of the buffer are copied into the new message, not just the
buffer reference:

Message copy=msg.copy(true);


Cheers
Dan



Re: [infinispan-dev] DIST.retrieveFromRemoteSource

2012-02-05 Thread Dan Berindei
On Sun, Feb 5, 2012 at 5:56 PM, Bela Ban b...@redhat.com wrote:


 On 2/5/12 4:40 PM, Dan Berindei wrote:


 Remember that a message might be retransmitted, so it is placed into a
 retransmit buffer. If M1 has destination A and M2 has destination B, and
 we send M1 first (to A), then change M1's destination to B, and send it,
 everything is fine. However, if we later get a retransmit request from
 B, we'd resend the message to A instead ! This is just 1 example,
 modifications of headers is another one.

 Note that the copy does *not* copy the buffer (payload) itself, but only
 references it, so this is fast. Of course, nobody is supposed to modify
 the contents of the buffer itself...


 I wasn't clear enough, but I didn't mean we should reuse the entire
 Message object. I meant we should copy the Message but not the buffer
 or the headers. I see now that protocols may be adding new headers, so
 it wouldn't be safe to reuse the headers collection.

 I think this line in
 RequestCorrelator.sendRequest(RequestCorrelator.java:152) means that
 the contents of the buffer is copied in the new message, not just the
 buffer reference:

                  Message copy=msg.copy(true);


 No, this does *not* copy the buffer, but simply references the same buffer.


Aha, I thought copy_buffer == true meant copy the contents and
copy_buffer == false meant share the contents. I see copy_buffer ==
true actually means copy the reference, share the contents and
copy_buffer == false means don't copy anything.
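
To summarize the semantics for anyone else reading along:

   // JGroups Message.copy(boolean copy_buffer), as clarified above:
   Message shared = msg.copy(true);   // copies the buffer *reference*; payload bytes are shared
   Message bare   = msg.copy(false);  // the copy carries no buffer at all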

I will modify our CommandAwareRpcDispatcher to use GroupRequest and
see how they compare, then we can continue this discussion with the
results.

Cheers
Dan


