[jira] [Commented] (SLING-8710) IndexingClient should have configurable lanes

2019-09-16 Thread Vikas Saurabh (Jira)


[ 
https://issues.apache.org/jira/browse/SLING-8710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16930534#comment-16930534
 ] 

Vikas Saurabh commented on SLING-8710:
--

Just created a PR 
https://github.com/apache/sling-org-apache-sling-testing-clients/pull/10 and 
realized [~volteanu]'s comment.

I can update the PR to have an explicit {{setLanes}} method - although, I think 
that would likely require updating tests using a pattern like 
{{client.adaptTo(IndexingClient.class).waitForAsyncIndexing}}. So, maybe it's 
ok to simply have extra csv of lanes in sling config map itself?

> IndexingClient should have configurable lanes
> -
>
> Key: SLING-8710
> URL: https://issues.apache.org/jira/browse/SLING-8710
> Project: Sling
>  Issue Type: Improvement
>  Components: Launchpad
>    Reporter: Vikas Saurabh
>Priority: Major
>  Labels: sling-IT
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {{IndexingClient}} currently relies on {{OsgiConsoleCient}} to peek into 
> system's lane configuration. That, in turn, requires access to 
> /system/console which might be blocked on the system which is being tested.
> As indexing lanes in a setup isn't a very dynamic property, a quick work 
> around could be for {{IndexingClient}} to allow setting up "expected index 
> lanes in target setup".
> This issue would be an improvement over what was implemented in SLING-7169. 
> [~rombert] and [~volteanu], since most of the comments over there are from 
> you guys, wdyt about this improvement?



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (SLING-8710) IndexingClient should have configurable lanes

2019-09-16 Thread Vikas Saurabh (Jira)


[ 
https://issues.apache.org/jira/browse/SLING-8710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16930445#comment-16930445
 ] 

Vikas Saurabh commented on SLING-8710:
--

bq. repository directly using JCR Remoting, similar to FileVault? Or is that 
API only exposed at the Oak level?
Unfortunately that information is available only at repository setup time and 
can be inspected as Osgi configuration for async indexer (which is being done 
in {{IndexingClient}} atm).

> IndexingClient should have configurable lanes
> -
>
> Key: SLING-8710
> URL: https://issues.apache.org/jira/browse/SLING-8710
> Project: Sling
>  Issue Type: Improvement
>  Components: Launchpad
>    Reporter: Vikas Saurabh
>Priority: Major
>  Labels: sling-IT
>
> {{IndexingClient}} currently relies on {{OsgiConsoleCient}} to peek into 
> system's lane configuration. That, in turn, requires access to 
> /system/console which might be blocked on the system which is being tested.
> As indexing lanes in a setup isn't a very dynamic property, a quick work 
> around could be for {{IndexingClient}} to allow setting up "expected index 
> lanes in target setup".
> This issue would be an improvement over what was implemented in SLING-7169. 
> [~rombert] and [~volteanu], since most of the comments over there are from 
> you guys, wdyt about this improvement?



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (SLING-8710) IndexingClient should have configurable lanes

2019-09-16 Thread Vikas Saurabh (Jira)


[ 
https://issues.apache.org/jira/browse/SLING-8710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16930332#comment-16930332
 ] 

Vikas Saurabh commented on SLING-8710:
--

Ack. In that direction, I think allowing configuration via props (passed in 
ctor) seems like the way forward. The tests would still need to make that 
configuration available though - potentially using some common class as that 
end.

Also, I think we should keep current logic of dynamic lanes evaluation if 
configuration isn't provided.

Wdyt?

> IndexingClient should have configurable lanes
> -
>
> Key: SLING-8710
> URL: https://issues.apache.org/jira/browse/SLING-8710
> Project: Sling
>  Issue Type: Improvement
>  Components: Launchpad
>    Reporter: Vikas Saurabh
>Priority: Major
>  Labels: sling-IT
>
> {{IndexingClient}} currently relies on {{OsgiConsoleCient}} to peek into 
> system's lane configuration. That, in turn, requires access to 
> /system/console which might be blocked on the system which is being tested.
> As indexing lanes in a setup isn't a very dynamic property, a quick work 
> around could be for {{IndexingClient}} to allow setting up "expected index 
> lanes in target setup".
> This issue would be an improvement over what was implemented in SLING-7169. 
> [~rombert] and [~volteanu], since most of the comments over there are from 
> you guys, wdyt about this improvement?



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (SLING-8710) IndexingClient should have configurable lanes

2019-09-16 Thread Vikas Saurabh (Jira)


[ 
https://issues.apache.org/jira/browse/SLING-8710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16930307#comment-16930307
 ] 

Vikas Saurabh commented on SLING-8710:
--

[~volteanu], were you referring to adding configuration (description) or 
extending (earlier comment)

> IndexingClient should have configurable lanes
> -
>
> Key: SLING-8710
> URL: https://issues.apache.org/jira/browse/SLING-8710
> Project: Sling
>  Issue Type: Improvement
>  Components: Launchpad
>    Reporter: Vikas Saurabh
>Priority: Major
>  Labels: sling-IT
>
> {{IndexingClient}} currently relies on {{OsgiConsoleCient}} to peek into 
> system's lane configuration. That, in turn, requires access to 
> /system/console which might be blocked on the system which is being tested.
> As indexing lanes in a setup isn't a very dynamic property, a quick work 
> around could be for {{IndexingClient}} to allow setting up "expected index 
> lanes in target setup".
> This issue would be an improvement over what was implemented in SLING-7169. 
> [~rombert] and [~volteanu], since most of the comments over there are from 
> you guys, wdyt about this improvement?



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (SLING-8710) IndexingClient should have configurable lanes

2019-09-15 Thread Vikas Saurabh (Jira)


[ 
https://issues.apache.org/jira/browse/SLING-8710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16930111#comment-16930111
 ] 

Vikas Saurabh commented on SLING-8710:
--

... alternatively, an argument could be made that it's anyway trivial to extend 
{{IndexingClient}} and override {{IndexingClient#getLaneNames}}.

> IndexingClient should have configurable lanes
> -
>
> Key: SLING-8710
> URL: https://issues.apache.org/jira/browse/SLING-8710
> Project: Sling
>  Issue Type: Improvement
>  Components: Launchpad
>    Reporter: Vikas Saurabh
>Priority: Major
>  Labels: sling-IT
>
> {{IndexingClient}} currently relies on {{OsgiConsoleCient}} to peek into 
> system's lane configuration. That, in turn, requires access to 
> /system/console which might be blocked on the system which is being tested.
> As indexing lanes in a setup isn't a very dynamic property, a quick work 
> around could be for {{IndexingClient}} to allow setting up "expected index 
> lanes in target setup".
> This issue would be an improvement over what was implemented in SLING-7169. 
> [~rombert] and [~volteanu], since most of the comments over there are from 
> you guys, wdyt about this improvement?



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (SLING-8710) IndexingClient should have configurable lanes

2019-09-15 Thread Vikas Saurabh (Jira)
Vikas Saurabh created SLING-8710:


 Summary: IndexingClient should have configurable lanes
 Key: SLING-8710
 URL: https://issues.apache.org/jira/browse/SLING-8710
 Project: Sling
  Issue Type: Improvement
  Components: Launchpad
Reporter: Vikas Saurabh


{{IndexingClient}} currently relies on {{OsgiConsoleCient}} to peek into 
system's lane configuration. That, in turn, requires access to /system/console 
which might be blocked on the system which is being tested.

As indexing lanes in a setup isn't a very dynamic property, a quick work around 
could be for {{IndexingClient}} to allow setting up "expected index lanes in 
target setup".

This issue would be an improvement over what was implemented in SLING-7169. 
[~rombert] and [~volteanu], since most of the comments over there are from you 
guys, wdyt about this improvement?



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (SLING-8407) JobManagerImpl.findJobs should prevent traversal

2019-05-10 Thread Vikas Saurabh (JIRA)


[ 
https://issues.apache.org/jira/browse/SLING-8407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16837424#comment-16837424
 ] 

Vikas Saurabh commented on SLING-8407:
--

[~marett], option traversal fail is a loose hint to pak to fail fast if the 
best available option is to traverse. It's akin to Index hints in usual 
parlance (well oak supports real index hint using option index tag... )

Going towards this new behavior using a configuration is already something that 
[~egli] proposed and [~tmueller] updated his patch accordingly.

> JobManagerImpl.findJobs should prevent traversal
> 
>
> Key: SLING-8407
> URL: https://issues.apache.org/jira/browse/SLING-8407
> Project: Sling
>  Issue Type: Improvement
>  Components: Event
>Reporter: Thomas Mueller
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The method 
> [JobManagerImpl.findJobs|https://github.com/apache/sling-org-apache-sling-event/blob/master/src/main/java/org/apache/sling/event/impl/jobs/JobManagerImpl.java#L373]
>  runs a JCR query to find all jobs for a topic.
> It is possible that such a query is running while the repository isn't 
> initialized yet, meaning while the index isn't available yet. What is 
> happening in this case is that the query is traversing all nodes below that 
> path, triggering a warning that the query doesn't use an index. It is 
> sometimes happening when a health check is running before the repository is 
> initialized (ReplicationQueueHealthCheck and DistributionQueueHealthCheck).
> It doesn't make sense that the query traverses the nodes. It should use an 
> index. If the index isn't available yet, it should fail. Therefore, the query 
> should use "option(traversal fail)". That would result in an exception that 
> can be caught.  I will log a related issue to change the health checks to 
> process this exception and return HEALTH_CHECK_ERROR for this case.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SLING-8407) JobManagerImpl.findJobs should prevent traversal

2019-05-10 Thread Vikas Saurabh (JIRA)


[ 
https://issues.apache.org/jira/browse/SLING-8407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16837320#comment-16837320
 ] 

Vikas Saurabh commented on SLING-8407:
--

[~marett],
bq.   to wait on a specific marker service and thus a specific index.
actually that won't work as well. Queries and indexes are de-coupled (not just 
in oak ... in a lot of DBs too). So, making a client know "which" index it 
should use for a query is troublesome in long run (I think). The intention is 
probably that the query should be backed by an index which is what {{option 
(traversal fail)}} indicates. But I don't know much osgi to say "try running 
this query with option traversal fail whenever a new index gets ready... once 
this is done then go on to call activate".

Also, that said, I don't completely agree to the idea of not allowing job 
manager impl at all just because find jobs functionality needs something which 
isn't ready. As [~tmueller] mentioned earlier - first setup is one case but 
imagine a case where the index which "should" answer query via {{findJobs}} is 
corrupt for some reason. Should that result in inability to addJobs as well?

> JobManagerImpl.findJobs should prevent traversal
> 
>
> Key: SLING-8407
> URL: https://issues.apache.org/jira/browse/SLING-8407
> Project: Sling
>  Issue Type: Improvement
>  Components: Event
>Reporter: Thomas Mueller
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The method 
> [JobManagerImpl.findJobs|https://github.com/apache/sling-org-apache-sling-event/blob/master/src/main/java/org/apache/sling/event/impl/jobs/JobManagerImpl.java#L373]
>  runs a JCR query to find all jobs for a topic.
> It is possible that such a query is running while the repository isn't 
> initialized yet, meaning while the index isn't available yet. What is 
> happening in this case is that the query is traversing all nodes below that 
> path, triggering a warning that the query doesn't use an index. It is 
> sometimes happening when a health check is running before the repository is 
> initialized (ReplicationQueueHealthCheck and DistributionQueueHealthCheck).
> It doesn't make sense that the query traverses the nodes. It should use an 
> index. If the index isn't available yet, it should fail. Therefore, the query 
> should use "option(traversal fail)". That would result in an exception that 
> can be caught.  I will log a related issue to change the health checks to 
> process this exception and return HEALTH_CHECK_ERROR for this case.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SLING-8407) JobManagerImpl.findJobs should prevent traversal

2019-05-10 Thread Vikas Saurabh (JIRA)


[ 
https://issues.apache.org/jira/browse/SLING-8407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16837309#comment-16837309
 ] 

Vikas Saurabh commented on SLING-8407:
--

bq. register a service implementing this API only when the index is ready
"*the* index" - which index would that be. I guess you meant all indexes. To me 
that sounds a bit of an overkill.

bq. The JobManagerImpl would be active only when the index is ready
afaiu, it's the {{findJobs}} method that requires a query. Doing it this way 
means the system won't allow adding jobs as well. I wonder if that's really 
reasonable (to me add jobs should keep on working... status consoles, 
dashboards, healthchecks unavailability is probably more tolerable that 
inability to even add a job.

> JobManagerImpl.findJobs should prevent traversal
> 
>
> Key: SLING-8407
> URL: https://issues.apache.org/jira/browse/SLING-8407
> Project: Sling
>  Issue Type: Improvement
>  Components: Event
>Reporter: Thomas Mueller
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The method 
> [JobManagerImpl.findJobs|https://github.com/apache/sling-org-apache-sling-event/blob/master/src/main/java/org/apache/sling/event/impl/jobs/JobManagerImpl.java#L373]
>  runs a JCR query to find all jobs for a topic.
> It is possible that such a query is running while the repository isn't 
> initialized yet, meaning while the index isn't available yet. What is 
> happening in this case is that the query is traversing all nodes below that 
> path, triggering a warning that the query doesn't use an index. It is 
> sometimes happening when a health check is running before the repository is 
> initialized (ReplicationQueueHealthCheck and DistributionQueueHealthCheck).
> It doesn't make sense that the query traverses the nodes. It should use an 
> index. If the index isn't available yet, it should fail. Therefore, the query 
> should use "option(traversal fail)". That would result in an exception that 
> can be caught.  I will log a related issue to change the health checks to 
> process this exception and return HEALTH_CHECK_ERROR for this case.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Possibility of making nt:resource unreferenceable

2016-10-04 Thread Vikas Saurabh
> At least in our DocumentStoreImplementations (Mongo and RDB), making the
> UUID something indexed by the storage (so either Mongo or the relational
> database) should be relatively cheap (after all, the UUID never changes once
> assigned, right?). That would eliminate the indexing overhead in Oak itself
> completely.

That'd just shift the storage from collection/table to
index/rdb_index. At least mongo tries to keep its indices in memory,
so the memory overhead would probably remain similar (well, in terms
of order of magnitude at least).

I think avoiding un-necessary uuid and optimizing uuid index are 2
separate problems. Chetan and Thomas, afaics, are discussing the
former.

Thanks,
Vikas


[jira] [Commented] (SLING-4216) Limit the number of vanityPath MapEntry

2014-12-18 Thread Vikas Saurabh (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-4216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14251612#comment-14251612
 ] 

Vikas Saurabh commented on SLING-4216:
--

bq. That is why the description of 
resource.resolver.vanity.bloom.filter.max.bytes says
Ah.. ok :). Patch looks fine to me.

 Limit the number of vanityPath MapEntry 
 

 Key: SLING-4216
 URL: https://issues.apache.org/jira/browse/SLING-4216
 Project: Sling
  Issue Type: Improvement
  Components: ResourceResolver
Reporter: Antonio Sanso
Assignee: Antonio Sanso
 Attachments: SLING-4216-patch.txt, SLING-4216-patch.txt, 
 SLING-4216-patch.txt


 At the moment there isn't any limit to the number of MapEntry that are cached 
 in memory.
 If the number of vanityPaths/alias is extremely high this can cause OOM.
 It would be good to have a way to limit the amount of memory used by the 
 MapEntry cache.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SLING-4216) Limit the number of vanityPath MapEntry

2014-12-16 Thread Vikas Saurabh (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-4216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14248824#comment-14248824
 ] 

Vikas Saurabh commented on SLING-4216:
--

[~asanso], I tried reading through the patch (unfortunately, I couldn't apply 
it on the trunk of my local repo :-/ ... I'm sure that's ignorance on my part 
though :)). Anyway, afaics, size of bloom filter can only be set once and then 
from next time on it'd respect the size according to persisted file length. I 
think if we want to expose filter size as properties, we'd need to clear up old 
persisted data when the properties are updated (or probably when a mismatch is 
detected between property value and file's size). Again, I'm sorry if I might 
have missed it and it's already being taken care of -- browser wasn't the best 
tool to read through the patch :(

 Limit the number of vanityPath MapEntry 
 

 Key: SLING-4216
 URL: https://issues.apache.org/jira/browse/SLING-4216
 Project: Sling
  Issue Type: Improvement
  Components: ResourceResolver
Reporter: Antonio Sanso
Assignee: Antonio Sanso
 Attachments: SLING-4216-patch.txt, SLING-4216-patch.txt, 
 SLING-4216-patch.txt


 At the moment there isn't any limit to the number of MapEntry that are cached 
 in memory.
 If the number of vanityPaths/alias is extremely high this can cause OOM.
 It would be good to have a way to limit the amount of memory used by the 
 MapEntry cache.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SLING-4216) Limit the number of vanityPath MapEntry

2014-12-12 Thread Vikas Saurabh (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-4216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14244063#comment-14244063
 ] 

Vikas Saurabh commented on SLING-4216:
--

[~asanso], I just realized that serialization can get the last state safely but 
we'd still need to take care of refreshing it with updates to repository while 
the instance was down. As far I can see, serialized file isn't being used to 
refresh filter yet -- but, I think this would need to be taken care of later.

 Limit the number of vanityPath MapEntry 
 

 Key: SLING-4216
 URL: https://issues.apache.org/jira/browse/SLING-4216
 Project: Sling
  Issue Type: Improvement
  Components: ResourceResolver
Reporter: Antonio Sanso
Assignee: Antonio Sanso
 Attachments: SLING-4216-patch.txt


 At the moment there isn't any limit to the number of MapEntry that are cached 
 in memory.
 If the number of vanityPaths/alias is extremely high this can cause OOM.
 It would be good to have a way to limit the amount of memory used by the 
 MapEntry cache.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SLING-4216) Limit the number of vanityPath MapEntry

2014-12-12 Thread Vikas Saurabh (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-4216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14244077#comment-14244077
 ] 

Vikas Saurabh commented on SLING-4216:
--

Sorry, I typed that in a hurry... I meant as far as I see the file isn't being 
used to read the old state back (on start up). I was just trying to say that 
once we implement that -- we'd need to take care of looking out for other 
changes that might have happened in the repository while the instance was down 
(and hence file doesn't have entries for vanity paths that might have got added 
in the mean time).

 Limit the number of vanityPath MapEntry 
 

 Key: SLING-4216
 URL: https://issues.apache.org/jira/browse/SLING-4216
 Project: Sling
  Issue Type: Improvement
  Components: ResourceResolver
Reporter: Antonio Sanso
Assignee: Antonio Sanso
 Attachments: SLING-4216-patch.txt


 At the moment there isn't any limit to the number of MapEntry that are cached 
 in memory.
 If the number of vanityPaths/alias is extremely high this can cause OOM.
 It would be good to have a way to limit the amount of memory used by the 
 MapEntry cache.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SLING-4216) Limit the number of vanityPath MapEntry

2014-12-12 Thread Vikas Saurabh (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-4216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14244094#comment-14244094
 ] 

Vikas Saurabh commented on SLING-4216:
--

bq. And I did actually did not change my mind
:)
bq. At the end of the day IMHO if a situation like you said will happen 
(possible but unlikely) it is better to rebuild the bloomFilter file from 
scratch
I agree although even finding that such a change has happened is equally 
expensive (from repo read point of view)

 Limit the number of vanityPath MapEntry 
 

 Key: SLING-4216
 URL: https://issues.apache.org/jira/browse/SLING-4216
 Project: Sling
  Issue Type: Improvement
  Components: ResourceResolver
Reporter: Antonio Sanso
Assignee: Antonio Sanso
 Attachments: SLING-4216-patch.txt


 At the moment there isn't any limit to the number of MapEntry that are cached 
 in memory.
 If the number of vanityPaths/alias is extremely high this can cause OOM.
 It would be good to have a way to limit the amount of memory used by the 
 MapEntry cache.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (SLING-4216) Limit the number of vanityPath MapEntry

2014-12-12 Thread Vikas Saurabh (JIRA)

[ 
https://issues.apache.org/jira/browse/SLING-4216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14244094#comment-14244094
 ] 

Vikas Saurabh edited comment on SLING-4216 at 12/12/14 1:11 PM:


bq. the reason why I was hesitating to add bloom filter in the past was really 
to avoid situations like this.
BTW, I think using bloom filter is still useful... it's just that serializing 
to file might not be trivial.
bq. And I did actually did not change my mind
:)
bq. At the end of the day IMHO if a situation like you said will happen 
(possible but unlikely) it is better to rebuild the bloomFilter file from 
scratch
I agree although even finding that such a change has happened is equally 
expensive (from repo read point of view)


was (Author: catholicon):
bq. And I did actually did not change my mind
:)
bq. At the end of the day IMHO if a situation like you said will happen 
(possible but unlikely) it is better to rebuild the bloomFilter file from 
scratch
I agree although even finding that such a change has happened is equally 
expensive (from repo read point of view)

 Limit the number of vanityPath MapEntry 
 

 Key: SLING-4216
 URL: https://issues.apache.org/jira/browse/SLING-4216
 Project: Sling
  Issue Type: Improvement
  Components: ResourceResolver
Reporter: Antonio Sanso
Assignee: Antonio Sanso
 Attachments: SLING-4216-patch.txt


 At the moment there isn't any limit to the number of MapEntry that are cached 
 in memory.
 If the number of vanityPaths/alias is extremely high this can cause OOM.
 It would be good to have a way to limit the amount of memory used by the 
 MapEntry cache.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Limit the number of vanityPath MapEntry

2014-12-08 Thread Vikas Saurabh
 Since the cache initialization was a separate thread I had to take care on 
 properly handle this situation and updating properly the bloomfilter.

Sorry for being a little slow to catch up. But, as far as I understood
the issue -- it is about that we need to load all vanity mappings
during startup to declare that the instance is indeed up. I thought it
implied that until the thread that loads/caches/marks all vanity paths
finishes its task, startup isn't marked as complete (please excuse the
different terms -- I'm not sure of the exact usages in sling)

 With this new approach of limiting the vanity Path entries size I think the 
 Bloom Filter can come back in the solution :)

I think even with bloom filter (assuming we don't serialize its state
back on clean shutdowns), we'd have to read 'all' vanity mappings once
during startup. Bloom filter would just use less memory to mark if we
really have mapping in repo (even when there is a miss on cached
vanity entry)

I think SLING-3290 is about reducing number of MapEntries required to
load up all vanity paths... but, I think SLING-4216 is more to do with
how many of vanity entries would we like to keep in memory -- which I
think can be solved with some cache implementation (probably LRU???) +
bloom filter(to tell if going back to repo would be useful or not)

Thanks,
Vikas


Re: Limit the number of vanityPath MapEntry

2014-12-05 Thread Vikas Saurabh
 - do a search for a vanityPath if the /requestPath is not in the cache (as 
 anticipated above I think this is not a great solution since it can be 
 expensive, indeed I assume the majorities of requests are not even 
 “vanityPath requests” so both the cache and the search will return empty 
 result)

I think the major concern here is that (a lot of) non-vanity path
resources would still have to go to repo for lookup. Would a bloom
filter[1] implementation be possible to help taking that decision?

Thanks,
Vikas
[1]: http://en.wikipedia.org/wiki/Bloom_filter


Re: Limit the number of vanityPath MapEntry

2014-12-05 Thread Vikas Saurabh
 we already took in consideration Bloom Filter for a related issue [2].
 We decided that is still not too optimal since it leads toward content 
 duplication and I would like to avoid that for now

 [2] https://issues.apache.org/jira/browse/SLING-3290


Well, imho, bloom filters won't duplicate content -- they'd just have
bit-masks to tentatively mark existence of a value. Moreover, if we
use guava's implementation (which I think sling doesn't want to do...
if I am reading SLING-3290 correctly), then we can serialize them on
clean shutdown to have practically no work done during startup. For
crashes, we can probably live with re-creating the filter again.

About, BloomFilterUtils attached in SLING-3290, I think it's just
using 1 hash function to create mask. In general, bloom filter
implementation would have more number of hashes to configure less
false-positives.

About caching actual data in RAM (and assuming sling would sit on top
of Oak??) -- should caching of most used nodes be a responsibility of
repository implementation?.. but, that's probably a different
discussion.

Thanks,
Vikas


[jira] [Created] (SLING-3245) TenantAdapterFactory can return incorrect 'null' tenants depending on order of pathMatchers

2013-11-15 Thread Vikas Saurabh (JIRA)
Vikas Saurabh created SLING-3245:


 Summary: TenantAdapterFactory can return incorrect 'null' tenants 
depending on order of pathMatchers
 Key: SLING-3245
 URL: https://issues.apache.org/jira/browse/SLING-3245
 Project: Sling
  Issue Type: Bug
  Components: Extensions
Affects Versions: Tenant 1.0.0
Reporter: Vikas Saurabh
Priority: Minor


For a case where users under /home/users/abc/tenant_name/ with an existing 
tenant tenant_name (assuming pathMatchers have a correct regex to be matched) 
should be identified with correct tenant without relying on order of entries in 
pathMatchers.
* Of course, the assumption is that pathMatchers can match a tenant name 
uniquely
* e.g. pathMathers={/home/users/([^/]+)/*, /home/users/abc/([^/]+)/*} should 
work



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (SLING-3245) TenantAdapterFactory can return incorrect 'null' tenants depending on order of pathMatchers

2013-11-15 Thread Vikas Saurabh (JIRA)

 [ 
https://issues.apache.org/jira/browse/SLING-3245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Saurabh updated SLING-3245:
-

Attachment: fix_early_tenantQuery_exit.patch

patch for the fix of the issue

 TenantAdapterFactory can return incorrect 'null' tenants depending on order 
 of pathMatchers
 ---

 Key: SLING-3245
 URL: https://issues.apache.org/jira/browse/SLING-3245
 Project: Sling
  Issue Type: Bug
  Components: Extensions
Affects Versions: Tenant 1.0.0
Reporter: Vikas Saurabh
Priority: Minor
 Attachments: fix_early_tenantQuery_exit.patch


 For a case where users under /home/users/abc/tenant_name/ with an existing 
 tenant tenant_name (assuming pathMatchers have a correct regex to be matched) 
 should be identified with correct tenant without relying on order of entries 
 in pathMatchers.
 * Of course, the assumption is that pathMatchers can match a tenant name 
 uniquely
 * e.g. pathMathers={/home/users/([^/]+)/*, /home/users/abc/([^/]+)/*} 
 should work



--
This message was sent by Atlassian JIRA
(v6.1#6144)