Re: [infinispan-dev] Why DistTxInterceptor in use with Hot Rod ?

2011-09-14 Thread Dan Berindei
Going back to your original question Galder, the exception is most
likely thrown because of this sequence of events:

0. Given a cluster {A, B}, a key k and a node C joining.
1. Put acquires the transaction lock on node A (blocking rehashing)
2. Put acquires lock for key k on node A
3. Rehashing starts on node B, blocking transactions
4. Put tries to acquire transaction lock on node B

Since it's impossible to finish rehashing while the put operation
holds the transaction lock on node A, the best option was to kill the
put operation by throwing a RehashInProgressException.

I was thinking in the context of transactions when I wrote this code
(see 
https://github.com/danberindei/infinispan/commit/6ed94d3b2e184d4a48d4e781db8d404baf5915a3,
where this scenario became just a footnote to the generic case with multiple
caches), but the scenario also occurs without transactions. I have since
renamed it to the 'state transfer lock' and moved it to a separate
interceptor in my ISPN-1194 branch.

Maybe the locking changes in 5.1 will eliminate this scenario, but
otherwise we could improve the user experience by retrying the command
after the rehashing finishes.
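
Something along these lines is what I have in mind for the retry (purely
illustrative and untested; the retry count and back-off are arbitrary):

   import org.infinispan.Cache;
   import org.infinispan.distribution.RehashInProgressException;

   // Illustrative only: retry a put that failed because a rehash was in progress.
   void putWithRetry(Cache<String, String> cache, String key, String value)
         throws InterruptedException {
      int retries = 3;                 // arbitrary retry budget
      while (true) {
         try {
            cache.put(key, value);
            return;
         } catch (RehashInProgressException e) {
            if (--retries == 0) throw e;
            Thread.sleep(100);         // crude back-off; better would be to wait for the rehash to finish
         }
      }
   }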

Dan


On Tue, Sep 13, 2011 at 8:11 PM, Galder Zamarreño gal...@redhat.com wrote:

 On Sep 13, 2011, at 6:38 PM, Mircea Markus wrote:


 On 13 Sep 2011, at 17:22, Galder Zamarreño wrote:

 Hi,

 I'm looking at this failure http://goo.gl/NQw4h and I'm wondering why 
 "org.infinispan.distribution.RehashInProgressException: Timed out waiting 
 for the transaction lock" is thrown.

 This is thrown by DistTxInterceptor, which is added by the 
 InterceptorChainFactory:

     if (configuration.getCacheMode().isDistributed())
        interceptorChain.appendInterceptor(createInterceptor(DistTxInterceptor.class));
     else
        interceptorChain.appendInterceptor(createInterceptor(TxInterceptor.class));

 However, this is a Hot Rod test and the cache is not configured with a 
 transaction manager, so is DistTxInterceptor really needed?

 In fact, why a TxInterceptor if no transaction manager is configured? (this 
 kinda goes back to Mircea's email earlier today about a transactions 
 enabled/disabled flag)
 All valid concerns, IMO.
 5.1 will clarify this by not adding Tx interceptors for non-tx caches.

 Is this captured in some other JIRA? I'd like to reference the test and the 
 JIRA from ISPN-1123 (the testsuite stabilisation JIRA).



 Cheers,
 --
 Galder Zamarreño
 Sr. Software Engineer
 Infinispan, JBoss Cache



 --
 Galder Zamarreño
 Sr. Software Engineer
 Infinispan, JBoss Cache




___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Infinispan security?

2011-09-14 Thread Manik Surtani

On 11 Sep 2011, at 16:40, Joni Hahkala wrote:

 On 07/09/2011 18:04, Manik Surtani wrote:
 On 2 Sep 2011, at 13:04, Joni Hahkala wrote:
 
 Are there any performance numbers for Infinispan? What kind of response
 times would be required from a secure version, and what are they now, etc.?
 No hard requirements as such, since I think anyone expecting to deploy a 
 secure data grid will expect performance tradeoffs.  What sort of factors do 
 you envisage with the approaches you outlined below?
 
 For SSL, you have the problem of the handshake: two 
 request-response cycles before you can even start to send the actual data. 
 So, with anything other than intra-cluster networking, you get bitten by the 
 network latency. You can resume old SSL sessions, which saves one 
 request-response cycle, or you can keep the sockets open and only do the 
 handshake once, but keeping the sockets open doesn't scale that far. 
 Then there is also the certificate checking, but that shouldn't be that big of 
 an issue unless you want to go for ultimate speed.

Well, for inter-node traffic, this could be a problem since there will be a lot 
of chatter.  And if we need to perform a handshake each time, that will kill 
performance.  Unless, as you say, persistent connections can be maintained.  In 
which case your real overhead is then just the encryption and signing.
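
For reference, a minimal JSSE sketch of the persistent-connection idea
(illustrative only; the host and port are made up, and in practice the
transport would manage the socket for you):

   import javax.net.ssl.SSLContext;
   import javax.net.ssl.SSLSocket;

   // Illustrative only: pay the handshake cost once and keep the socket open.
   SSLSocket openPersistentSslSocket(String host, int port) throws Exception {
      SSLContext ctx = SSLContext.getDefault();
      SSLSocket socket = (SSLSocket) ctx.getSocketFactory().createSocket(host, port);
      socket.startHandshake();  // the two network round trips happen here, once
      return socket;            // reuse for many requests; new connections from the same
                                // SSLContext can also resume the cached session
   }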

 With everything there is the overhead of encryption and possibly the signing, 
 but that shouldn't be such a big slowdown.
 
 The Seam framework seems to be more about user authentication, and would 
 probably be good for managing the authorization information. But for 
 authentication between the nodes and clients, maybe keys would be better, kind 
 of like SSH does.
 
 Cheers,
 Joni
 
 
 --
 Manik Surtani
 ma...@jboss.org
 twitter.com/maniksurtani
 
 Lead, Infinispan
 http://www.infinispan.org
 
 
 
 
 

--
Manik Surtani
ma...@jboss.org
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org




___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] multi-mapping with indexing - do we need big-table

2011-09-14 Thread Manik Surtani
Hi Kapil

After reading through this again, it is indeed an interesting use case.  My 
comments inline:

On 9 Sep 2011, at 05:23, kapil nayar wrote:

 We have two data sets {A1, A2, A3...} and {B1, B2, B3...}.
 Each B has some associated data {C1, C2, C3} which has a 1:1 mapping.
 
 The mappings would be something like (assume that C would be stored alongside B):
 A1 -> B1, B2
 A2 -> B3, B5
 A3 -> B4, B6, B7
 
 Now, we would need the following indexes:
 A -> B and B -> A
 
 Notice that both are unique mappings; however, as shown, A has multiple 
 mappings to B.
 Big-table-style data structures allow this and make it pretty easy off 
 the shelf.
 
 Now, I am trying to explore if we can implement these mappings with 
 Infinispan.
 We may need a basic multi-map - to store multiple values for the same key in 
 the cache.
 
 1. The get would return the complete list of the values.
 2. The put would add the new value without replacing the existing value.
 3. The remove would remove a specific value or optionally all values 
 associated with the key.
 4. These operations (especially put) on the same key can occur 
 simultaneously from multiple nodes.
 
 I know there is an atomic map option in Infinispan which may be applicable, 
 but AFAIK it requires transactions (which we want to avoid..).

The AtomicMap does do this, but will lock the entire map for any operation.  
We're working on a FineGrainedMap as well, which will allow concurrent updates 
to contents within the map.  See https://issues.jboss.org/browse/ISPN-1115

However this too is likely to require JTA transactions for consistency.  Could 
you explain why you wish to avoid transactions?
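
For reference, this is roughly what the AtomicMap route looks like (a minimal
sketch, assuming a transactional cache; the key and value names just mirror
your example):

   import org.infinispan.Cache;
   import org.infinispan.atomic.AtomicMap;
   import org.infinispan.atomic.AtomicMapLookup;

   // Illustrative only: one AtomicMap per A key, holding its B -> C associations.
   void addMapping(Cache<String, Object> cache, String aKey, String bKey, String cValue) {
      AtomicMap<String, String> mappings = AtomicMapLookup.getAtomicMap(cache, aKey);
      mappings.put(bKey, cValue);  // adds a B -> C entry without touching the other Bs under A
   }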

 
 Alternatively, perhaps Infinispan (in combination with Lucene) can be used.
 1. We should be able to create a data structure {B, C} and store A -> {B,C} with 
 indexes defined for B.
 2. Also, the key A could be structured as a combination of A+B to store 
 multiple entries like A1B1 -> {B1,C1} and A1B2 -> {B2,C2}. Lucene would allow 
 wildcarded searches, e.g. to look for all A1 values we could do something 
 like A1*, which should return both A1B1 and A1B2. I may be making some 
 assumptions here (feel free to correct!)

Yes, this should be possible.
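
Roughly along these lines (an untested sketch; BEntry and the aKey field are
made-up names standing in for whatever indexed value class you end up using):

   import java.util.List;
   import org.apache.lucene.index.Term;
   import org.apache.lucene.search.WildcardQuery;
   import org.infinispan.Cache;
   import org.infinispan.query.CacheQuery;
   import org.infinispan.query.Search;
   import org.infinispan.query.SearchManager;

   // Illustrative only: values stored under composite keys like "A1B1", carrying an
   // indexed "aKey" field, so all entries for A1 can be found with a wildcard query.
   List<Object> findAllForA(Cache<String, BEntry> cache, String aKey) {
      SearchManager sm = Search.getSearchManager(cache);
      CacheQuery q = sm.getQuery(new WildcardQuery(new Term("aKey", aKey + "*")), BEntry.class);
      return q.list();
   }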

 3. There seems to be one bottleneck though - since the cache mode is 
 distribution, it seems it is mandatory to use a backend DB to store these 
 indexes and moreover the DB needs to be shared. This requirement actually 
 seems to defeat the purpose of using Infinispan.

Not necessarily.  You can configure Lucene to store indexes in a replicated 
Infinispan cache as well.  This means the indexes are globally available, and 
in-memory.  You would need a lot of memory though!  :)
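
For reference, the relevant knob is Hibernate Search's Infinispan directory
provider; something along these lines (a sketch of the indexing properties,
so please double-check the exact keys against the version you're on):

   import java.util.Properties;

   // Illustrative only: ask Hibernate Search (which backs Infinispan Query) to keep
   // the Lucene index in Infinispan caches rather than on a shared filesystem/DB.
   Properties indexing = new Properties();
   indexing.setProperty("hibernate.search.default.directory_provider", "infinispan");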

Cheers
Manik
--
Manik Surtani
ma...@jboss.org
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org



___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev

Re: [infinispan-dev] Externalizer and Commands ID ranges

2011-09-14 Thread Galder Zamarreño


On Sep 13, 2011, at 12:24 PM, Sanne Grinovero wrote:

 On 13 September 2011 11:25, Galder Zamarreño gal...@redhat.com wrote:
 
 On Sep 12, 2011, at 1:32 PM, Sanne Grinovero wrote:
 
 I've added a new table as suggested by Galder, and sent a pull request
 to constrain Query in the range I just defined [1]
 
 For this case we only have a single byte to share across all commands.
 Is it reasonable to reserve 100 values for Infinispan core, and blocks
 of 20 for modules that need them?
 
 1 - https://github.com/infinispan/infinispan/pull/526
 
 No need to reserve for Infinispan core cos that's indexed differently to the 
 externally provided commands. Same thing happens for externally defined 
 Externalizers.
 
 That would be nice, but it's not the case currently: If I change the
 org.infinispan.query.ModuleCommandIds.CLUSTERED_QUERY from 101 to 12
 the test org.infinispan.query.blackbox.ClusteredQueryTest
 is going to throw several exceptions with the following stacktrace:
 
 Caused by: java.lang.ClassCastException: org.infinispan.query.clustered.ClusteredQueryCommand cannot be cast to org.infinispan.commands.tx.PrepareCommand
   at org.infinispan.commands.CommandsFactoryImpl.initializeReplicableCommand(CommandsFactoryImpl.java:277)
   at org.infinispan.remoting.InboundInvocationHandlerImpl.handleInternal(InboundInvocationHandlerImpl.java:180)
   at org.infinispan.remoting.InboundInvocationHandlerImpl.handleWithWaitForBlocks(InboundInvocationHandlerImpl.java:199)
   at org.infinispan.remoting.InboundInvocationHandlerImpl.handleWithRetry(InboundInvocationHandlerImpl.java:319)
   at org.infinispan.remoting.InboundInvocationHandlerImpl.handle(InboundInvocationHandlerImpl.java:171)
   at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.executeCommand(CommandAwareRpcDispatcher.java:165)
   at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.handle(CommandAwareRpcDispatcher.java:144)
   ... 22 more
 
 Do I have to open a JIRA to change this or are we going to keep the
 first 100 reserved for core?

Hmmm, I think a new JIRA is needed here, cos at first glance, it means that 
https://issues.jboss.org/browse/ISPN-1162 is not fully resolved.

We should not wait until execution to discover this. Given that we already 
check for dups for internal cmds, and ModuleProperties checks for externally 
provided ones, I decided to keep both worlds separate. So, the testcase you 
show above should work.

 
 Sanne
 

--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache


___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


[infinispan-dev] Versioned entries - overview of design, looking for comments

2011-09-14 Thread Manik Surtani
So I've been hacking on versioned entries for a bit now, and want to run the 
designs by everyone. Adding an EntryVersion to each entry is easy, and making this 
optional and null by default is easy too; a SimpleVersion is a wrapper around a 
long, and a PartitionTolerantVersion is a vector clock implementation.  Also 
easy stuff: changing the entry hierarchy and the marshalling to ensure versions 
- if available - are shipped, etc.

Comparing versions would happen in Mircea's optimistic locking code, on 
prepare, when a write skew check is done.  If running in a non-clustered 
environment, the simple object-identity check we currently have is enough; 
otherwise an EntryVersion.compare() will need to happen, with one of 4 possible 
results: equal, newer than, older than, or concurrently modified.  The last one 
can only happen if you have a PartitionTolerantVersion, and will indicate a 
split brain and simultaneous update.
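
To make the comparison contract concrete, this is roughly the shape I have in
mind (a sketch of the design above, not a final API; the names may well change):

   // Sketch only: the four outcomes of comparing two versions.
   enum VersionComparisonResult { EQUAL, NEWER, OLDER, CONCURRENTLY_MODIFIED }

   interface EntryVersion {
      // CONCURRENTLY_MODIFIED can only come from a PartitionTolerantVersion (vector
      // clock) and signals a split brain plus simultaneous update.
      VersionComparisonResult compareTo(EntryVersion other);
   }

   // SimpleVersion wraps a long, so comparison is plain numeric ordering.
   class SimpleVersion implements EntryVersion {
      private final long counter;
      SimpleVersion(long counter) { this.counter = counter; }
      public VersionComparisonResult compareTo(EntryVersion other) {
         long o = ((SimpleVersion) other).counter;   // sketch: assumes same version type
         if (counter == o) return VersionComparisonResult.EQUAL;
         return counter > o ? VersionComparisonResult.NEWER : VersionComparisonResult.OLDER;
      }
   }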

Now the hard part.  Who increments the version?  We have a few options, all 
seem expensive.

1) The modifying node.  If the modifying node is a data owner, then easy.  
Otherwise the modifying node *has* to do a remote GET first (or at least a 
GET_VERSION) before doing a PUT.  Extra RPC per entry.  Sucks.

2) The data owner.  This would have to happen on the primary data owner only, 
and the primary data owner would need to perform the write skew check.  NOT the 
modifying node.  The modifying node would also need to increment and ship its 
own NodeClock along with the modification. Extra info to ship per commit.

I'm guessing we go with #2, but would like to hear your thoughts.

Cheers
Manik

--
Manik Surtani
ma...@jboss.org
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org




___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] ISPN-1374 and ModuleProperties, static collections and class loading

2011-09-14 Thread Manik Surtani

On 13 Sep 2011, at 00:03, Sanne Grinovero wrote:

 Hi,
 I'm not suggesting that the classloader is completely ignored; it is
 indeed evaluated at the first invocation but then if the following
 method is invoked again with a different classloader as argument, it
 will return the previously cached value:
 
 https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/util/ModuleProperties.java#L135
 
 Note that the method you pointed to is private, and is actually a
 helper for the public methods, which do cache all of their results in
 static fields.
 
 So assuming this will be invoked by a single classloader only, indeed
 there are no issues. But is that really the case?
 Wasn't the purpose of the classloader parameter to load extensions
 from a deployed application? If so, it seems I can't deploy two
 different applications both attempting to start an Infinispan
 cachemanager.

Well, why would the class loader in this case make a difference, unless you are 
in an OSGi environment?  Remember that this isn't used to load application 
classes.  Just Infinispan module classes.  In this case the OSGi file lookup 
should be able to handle the appropriate loader for each bundle/module.  Will 
need to make sure this works for JBoss AS 7 modules too.

 
 For example, I suspect that you won't be able to deploy a Hibernate
 Search application (or Infinispan Query) and then deploy a Hibernate
 OGM-based application in the same container.
 But this is not proven, as I didn't try it out, so maybe my assumptions
 about the goal of this classloader parameter are wrong.

Ah ok, I think I see your problem: that some Infinispan modules are bundled 
with an application, using an application-scoped class loader (a web app)?  Ok, 
I can see how that could be a problem then.

 So I think that, iff we need to cache this information, it shouldn't
 be cached in a static field, as discussed as well on

Well, the purpose of caching this info is to prevent each new named Cache from 
re-reading module properties.  Each named cache only reads these properties 
once at startup, so caching this is useless if this isn't shared across named 
caches.  Or perhaps we maintain one such module cache per class loader passed 
in?
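
Something like this is what I mean by a per-class-loader cache (a rough sketch;
the value type just stands in for whatever we keep in the static fields today):

   import java.util.Collections;
   import java.util.List;
   import java.util.Map;
   import java.util.WeakHashMap;

   // Illustrative only: cache discovered module metadata per class loader instead of
   // in one static field, so deployments with different loaders don't stomp on each other.
   class ModuleMetadataCache {
      // WeakHashMap so an undeployed application's class loader can still be collected.
      private static final Map<ClassLoader, List<String>> CACHE =
            Collections.synchronizedMap(new WeakHashMap<ClassLoader, List<String>>());

      static List<String> modulesFor(ClassLoader cl) {
         List<String> modules = CACHE.get(cl);
         if (modules == null) {
            modules = scanForModules(cl);   // stand-in for what ModuleProperties does today
            CACHE.put(cl, modules);
         }
         return modules;
      }

      private static List<String> scanForModules(ClassLoader cl) {
         return Collections.emptyList();    // placeholder for the real service lookup
      }
   }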

Cheers
Manik
--
Manik Surtani
ma...@jboss.org
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org




___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] ISPN-1374 and ModuleProperties, static collections and class loading

2011-09-14 Thread Sanne Grinovero
On 14 September 2011 17:37, Manik Surtani ma...@jboss.org wrote:

 On 13 Sep 2011, at 00:03, Sanne Grinovero wrote:
 For example, I suspect that you won't be able to deploy an Hibernate
 Search application (or Infinispan Query) and then deploy a Hibernate
 OGM based application in the same container.
 But this is not proven as I didn't try it out, so maybe my assumptions
 about what the goal of this classloader parameter are wrong.

 Ah ok, I think I see your problem: that some infinispan modules are bundled 
 with an application, using an application-scoped class loader (a web app)?  
 Ok, I can see how that could be a problem then.

Exactly the point. Unless you can make sure that both OGM and Search
are included in AS7 and special purpose caches are pre-configured out
of the box :-)


 So I think that, iff we need to cache this information, it shouldn't
 be cached in a static field, as discussed as well on

 Well, the purpose of caching this info is to prevent each new named Cache 
 from re-reading module properties.  Each named cache only reads these 
 properties once at startup, so caching this is useless if this isn't shared 
 across named caches.  Or perhaps we maintain one such module cache per class 
 loader passed in?

Since caches can be started only once, and that should happen in the context
of a startCaches( ... ) invocation, such a cache could live in the scope
of that invocation.
Besides solving the (potential?) problem, that would also save some
memory, as this information would be released right after use.

I don't think people will be able to reuse the AS7-managed caches for
the purposes of Search or OGM, as for these reasons such extensions
should be available at AS7 boot time, so we should at least make sure
that starting your own EmbeddedCacheManager is an option, otherwise
I'll be left with two options, neither of them viable.

Cheers,
Sanne

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev

[infinispan-dev] batching and auto-commit

2011-09-14 Thread Mircea Markus
Hi,

ATM I cannot enable both batching and auto-commit[1] because of the way 
batching is implemented:
- it starts a tx, suspends it and holds it in a thread local so that when a 
put arrives it can resume it
- when I do a put in a batch, the auto-commit code, which runs first, doesn't see 
any tx associated with the thread and starts and commits a new tx

Is there any reason why the batch container/interceptor doesn't want to expose 
the batch-induced transaction to the outside world? The only drawback I see 
with that is that if some other XA resource is used within the batch, it will 
participate in the dist tx.

[1] auto-commit is a new feature in 5.1 which injects a tx for transactional 
caches so that the user won't have to start/stop one for single-key operations.
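
To make the clash concrete, this is roughly what happens today, expressed
against the JTA API (a sketch, not the actual interceptor code):

   import javax.transaction.Transaction;
   import javax.transaction.TransactionManager;

   // Sketch of the current batching behaviour.
   // startBatch(): begin a tx, then immediately suspend it and stash it in a thread local.
   void startBatch(TransactionManager tm, ThreadLocal<Transaction> batchTx) throws Exception {
      tm.begin();
      batchTx.set(tm.suspend());
   }

   // put() in a batch: the batch interceptor resumes the stashed tx around the operation.
   // But the auto-commit check runs before this, sees no tx associated with the thread,
   // and starts/commits its own tx instead, which is why the two features currently clash.
   void putInBatch(TransactionManager tm, ThreadLocal<Transaction> batchTx, Runnable doPut)
         throws Exception {
      tm.resume(batchTx.get());
      try {
         doPut.run();
      } finally {
         batchTx.set(tm.suspend());
      }
   }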

Cheers,
Mircea  
___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] batching and auto-commit

2011-09-14 Thread Manik Surtani

On 14 Sep 2011, at 17:01, Mircea Markus wrote:

 Is there any reason why the batch container/interceptor doesn't want to 
 expose the batch-induced transaction to the outside world? The only drawback 
 I see with that is that if some other XA resource is used within the batch, it 
 will participate in the dist tx.

Yup, that's the reason.  

--
Manik Surtani
ma...@jboss.org
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org



___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev

Re: [infinispan-dev] Versioned entries - overview of design, looking for comments

2011-09-14 Thread Sanne Grinovero
Wouldn't the node performing the operation always do an RPC anyway iff
the intended operation is to replace a specific value?

Examples:

 - If I do a put() operation which doesn't skip the return value, the
RPC has to be performed; we get the current version value, which is what
we will check to be unchanged when committing the write operation.
This includes putIfAbsent and all other atomic operations, which
are in common use and all need an RPC anyway. This means that L1
caches will contain the version number as well.

 - If I do a put() operation in which I skip the return value, or a
remove() without any check (nor lock), it seems the user's intention is
to overwrite/delete whatever was there without any further check at
commit time. In fact this doesn't acquire the optimistic lock (see the
small sketch after this list for the distinction between these two cases).

- Any read operation will need an RPC anyway. (If relevant: I guess
for this case we could have different options on whether to roll back or
proceed when detecting stale reads at commit time.)

- Any lock() operation will force such an RPC.
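
A tiny sketch of the distinction between the first two cases (the flag is the
existing SKIP_REMOTE_LOOKUP; the version bookkeeping itself is hypothetical):

   import org.infinispan.AdvancedCache;
   import org.infinispan.Cache;
   import org.infinispan.context.Flag;

   void illustrate(Cache<String, String> cache) {
      // Case 1: a plain put() returns the previous value, so the same RPC could also
      // carry back the entry's version for the write-skew check at commit time.
      String previous = cache.put("k", "v1");

      // Case 2: skipping the remote lookup says "overwrite whatever is there", so no
      // version is fetched and arguably no check is needed at commit time.
      AdvancedCache<String, String> advanced = cache.getAdvancedCache();
      advanced.withFlags(Flag.SKIP_REMOTE_LOOKUP).put("k", "v2");
   }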

I don't think I understood the difference between #1 and #2. In both
cases the writing node needs to retrieve the version information, and
the key owner will perform the version increment at commit time, if
the skew check is happy.

Sanne


On 14 September 2011 16:03, Manik Surtani ma...@jboss.org wrote:
 So I've been hacking on versioned entries for a bit now, and want to run the 
 designs by everyone. Adding an EntryVersion to each entry is easy, and making 
 this optional and null by default is easy too; a SimpleVersion is a wrapper 
 around a long, and a PartitionTolerantVersion is a vector clock 
 implementation.  Also easy stuff: changing the entry hierarchy and the 
 marshalling to ensure versions - if available - are shipped, etc.

 Comparing versions would happen in Mircea's optimistic locking code, on 
 prepare, when a write skew check is done.  If running in a non-clustered 
 environment, the simple object-identity check we currently have is enough; 
 otherwise an EntryVersion.compare() will need to happen, with one of 4 
 possible results: equal, newer than, older than, or concurrently modified.  
 The last one can only happen if you have a PartitionTolerantVersion, and will 
 indicate a split brain and simultaneous update.

 Now the hard part.  Who increments the version?  We have a few options, all 
 seem expensive.

 1) The modifying node.  If the modifying node is a data owner, then easy.  
 Otherwise the modifying node *has* to do a remote GET first (or at least a 
 GET_VERSION) before doing a PUT.  Extra RPC per entry.  Sucks.

 2) The data owner.  This would have to happen on the primary data owner only, 
 and the primary data owner would need to perform the write skew check.  NOT 
 the modifying node.  The modifying node would also need to increment and ship 
 its own NodeClock along with the modification. Extra info to ship per commit.

 I'm guessing we go with #2, but would like to hear your thoughts.

 Cheers
 Manik

 --
 Manik Surtani
 ma...@jboss.org
 twitter.com/maniksurtani

 Lead, Infinispan
 http://www.infinispan.org






___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev