Re: New Jackrabbit Committer: Mohit Kataria

2019-08-25 Thread Dominik Süß
Welcome Mohit

Cheers
Dominik

Chetan Mehrotra wrote on Thu, Aug 22, 2019 at 06:21:

> Welcome Mohit!
> Chetan Mehrotra
>
> On Thu, Aug 15, 2019 at 7:26 PM Matt Ryan  wrote:
> >
> > Welcome Mohit!
> >
> >
> > -MR
> >
> > On Thu, Aug 15, 2019 at 6:51 AM Julian Sedding wrote:
> >
> > > Welcome Mohit!
> > >
> > > Regards
> > > Julian
> > >
> > > On Wed, Aug 14, 2019 at 3:25 PM Woonsan Ko  wrote:
> > > >
> > > > Welcome, Mohit!
> > > >
> > > > Cheers,
> > > >
> > > > Woonsan
> > > >
> > > > On Wed, Aug 14, 2019 at 2:31 AM Tommaso Teofili
> > > >  wrote:
> > > > >
> > > > > Welcome to the team Mohit!
> > > > >
> > > > > Regards,
> > > > > Tommaso
> > > > >
> > > > > > On Thu, 8 Aug 2019 at 08:33, Marcel Reutegger wrote:
> > > > >>
> > > > >> Hi,
> > > > >>
> > > > >> Please welcome Mohit Kataria as a new committer and PMC member of
> > > > > >> the Apache Jackrabbit project. The Jackrabbit PMC recently decided to
> > > > >> offer Mohit committership based on his contributions. I'm happy to
> > > > >> announce that he accepted the offer and that all the related
> > > > >> administrative work has now been taken care of.
> > > > >>
> > > > >> Welcome to the team, Mohit!
> > > > >>
> > > > >> Regards
> > > > >>  Marcel
> > > > >>
> > >
>


Re: New Jackrabbit Committer: Nitin Gupta

2019-08-25 Thread Dominik Süß
Welcome Nitin!
Cheers Dominik

Chetan Mehrotra wrote on Thu, Aug 22, 2019 at 06:19:

> Welcome Nitin!
>
> Chetan Mehrotra
>
> On Thu, Aug 15, 2019 at 7:26 PM Matt Ryan  wrote:
> >
> > Welcome Nitin!
> >
> >
> > -MR
> >
> > On Thu, Aug 15, 2019 at 6:51 AM Julian Sedding wrote:
> >>
> >> Welcome Nitin!
> >>
> >> Regards
> >> Julian
> >>
> >> On Wed, Aug 14, 2019 at 3:24 PM Woonsan Ko  wrote:
> >> >
> >> > Welcome, Nitin!
> >> >
> >> > Cheers,
> >> >
> >> > Woonsan
> >> >
> >> > On Wed, Aug 14, 2019 at 2:30 AM Tommaso Teofili
> >> >  wrote:
> >> > >
> >> > > Welcome to the team Nitin!
> >> > >
> >> > > Regards,
> >> > > Tommaso
> >> > >
> >> > > On Thu, 8 Aug 2019 at 08:31, Marcel Reutegger wrote:
> >> > >>
> >> > >> Hi,
> >> > >>
> >> > >> Please welcome Nitin Gupta as a new committer and PMC member of
> >> > >> the Apache Jackrabbit project. The Jackrabbit PMC recently decided to
> >> > >> offer Nitin committership based on his contributions. I'm happy to
> >> > >> announce that he accepted the offer and that all the related
> >> > >> administrative work has now been taken care of.
> >> > >>
> >> > >> Welcome to the team, Nitin!
> >> > >>
> >> > >> Regards
> >> > >>  Marcel
> >> > >>
>


Re: Content Package - Check for new index definitions

2019-07-25 Thread Dominik Süß
Hi Konrad,

rep:policy should for sure not cause issues - but ACLs are orthogonal to
the index definitions. IIRC stuff like excludePath etc. may be defined
below, and changes in there should fail.

So option D would be to just exclude the ACL handling, which anyhow is
handled quite specifically with its own merge modes etc.

Cheers
Dominik

Konrad Windszus wrote on Wed, Jul 24, 2019 at 12:13:

> Hi,
> the filevault-package-maven-plugin validates if a package contains a new
> index definition. This behaviour can be disabled via the
> "allowIndexDefinitions" parameter (
> https://jackrabbit.apache.org/filevault-package-maven-plugin/generate-metadata-mojo.html#allowIndexDefinitions)
> . There was a bug recently reported about that at
> https://issues.apache.org/jira/browse/JCRVLT-343?focusedCommentId=16891508&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16891508
> because there was a rep:policy node modified below oak:index. It is not
> 100% clear under which circumstances the validation should fail:
> a) only for synchronous index definition nodes
> b) for all index node definitions
> c) for all changes below /oak:index
>
> I tend to say actually only a) might lead to issues and therefore only
> those nodes should make the build optionally fail.
> WDYT?
>
> Thanks,
> Konrad
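
The options in this thread (a to c from Konrad, plus option D above) can be
written down as a small decision function. This is only a sketch and not how
the filevault-package-maven-plugin works internally: the class, enum and
method names are made up, node properties are simplified to plain strings,
and it assumes that an index definition is a node of type
oak:QueryIndexDefinition which counts as synchronous when it has no "async"
property.

```java
import java.util.Map;

/** Hypothetical sketch of the validation options discussed in this thread. */
final class IndexDefinitionCheck {

    /** Options a) to c) from the thread, plus option D (like c, but ignoring ACL nodes). */
    enum Policy { SYNC_INDEX_DEFINITIONS, ALL_INDEX_DEFINITIONS, ALL_BELOW_OAK_INDEX, ALL_EXCEPT_ACL }

    /**
     * Returns true if a changed node should make the build fail.
     * The properties map holds the node's properties as plain strings (a simplification).
     */
    static boolean shouldFail(Policy policy, String path, Map<String, String> properties) {
        if (!path.contains("/oak:index/")) {
            return false; // not below an oak:index node
        }
        boolean isDefinition = "oak:QueryIndexDefinition".equals(properties.get("jcr:primaryType"));
        boolean isSynchronous = isDefinition && !properties.containsKey("async");
        switch (policy) {
            case SYNC_INDEX_DEFINITIONS: return isSynchronous;
            case ALL_INDEX_DEFINITIONS:  return isDefinition;
            case ALL_BELOW_OAK_INDEX:    return true;
            case ALL_EXCEPT_ACL:         return !isAclPath(path);
            default: throw new IllegalArgumentException("unknown policy " + policy);
        }
    }

    /** True for a rep:policy node or anything below it. */
    static boolean isAclPath(String path) {
        return path.endsWith("/rep:policy") || path.contains("/rep:policy/");
    }
}
```

With this sketch, the case from JCRVLT-343 (a modified rep:policy node below
oak:index) fails only under option c, which matches the report.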


Re: New Jackrabbit Committer: Dominik Süß

2019-07-25 Thread Dominik Süß
Hello everyone,

Like Konrad, I want to say thanks a lot for the invitation.


Here is a short version of my own background. I started working as an
integrator for AEM/CQ in 2007, which is how I got in touch with Jackrabbit,
and became an active member, mostly of the Sling community, soon after. In
2015 I joined AEM engineering, where I worked more on the details
of the stack and began to contribute once in a while.

For a few years now my focus has mostly been on deployment aspects, such as
content that links directly to an application that may change over time, or
the installation and necessary transformation of content over time without
negative impact on the availability of a system.


I share Konrad's interest in filevault, but also in related topics such as
the composite node store, oak-upgrade and any other mechanism that links to
the automation of changes in Jackrabbit.


Cheers

Dominik

On Thu, Jul 25, 2019 at 4:02 PM Woonsan Ko  wrote:

> Welcome, Dominik!
>
> Cheers,
>
> Woonsan
>
> On Thu, Jul 25, 2019 at 9:54 AM Marcel Reutegger 
> wrote:
> >
> > Hi,
> >
> > Please welcome Dominik Süß as a new committer and PMC member of
> > the Apache Jackrabbit project. The Jackrabbit PMC recently decided to
> > offer Dominik committership based on his contributions. I'm happy to
> > announce that he accepted the offer and that all the related
> > administrative work has now been taken care of.
> >
> > Welcome to the team, Dominik!
> >
> > Regards
> > Marcel
>


Re: Filevault Package Manager HTTP API

2018-07-23 Thread Dominik Süß
Hi all,

I just wanted to add that I already started to work on a complementary
aspect of the registry, which is an FS-based registry (
https://issues.apache.org/jira/browse/JCRVLT-180), but limited to the
application package aspect (so only focused on extraction, with no snapshots)
and without any uninstall scenario.

Thinking end to end, this is about efficiently and deterministically
building up the immutable (application) part that can then be used in a
composite nodestore scenario with optimal footprint (e.g. for Docker layers).

Looking at the numbers so far, this already looks like it drops a lot of
overhead that's not necessary for the given scenario, but the majority of
time is still being spent on building up and committing the changes to the
nodestore. So I would be more than happy to join forces with someone who is
able to work on the execution plan and the interaction with the nodestore to
overcome the current bottlenecks (beyond the topics and challenges already
mentioned by Toby).

Cheers
Dominik

Tobias Bocanegra wrote on Mon, Jul 23, 2018 at 11:03:

> Hi,
>
> I recently started to write a HTTP API for the filevault package manager
> [0]. The current state covers maybe 20% of the functionality. I use an
> annotation-based "framework" to describe the models which then are rendered
> into siren+json format. I also used postman [1] to write tests which are
> also executed via `npm test`. The good thing about postman is, that it
> allows to generate some documentation [2].
>
> Most of the work I've done so far is to setup the framework and to
> implement basic CRUD operations. The more challenging bits, like
> install/uninstall are still missing. The big problem for the long-running
> operations is that they should be async. Otherwise the client has to keep
> the connection open in order to observe the progress. Therefore I would
> also like to introduce some task-based mechanism for install/uninstall [3].
>
> Beyond the HTTP API, there is also pending work to be done on the new
> Package Registry which has some cool new features like defining an
> execution plan which allows to install multiple packages "at once". I think
> this would align well with the task-based mechanism mentioned above (maybe
> even be the only way the HTTP api would use the core).
>
> Currently my designated time to work on filevault is very small, so I'm
> looking for people who are interested to work on this.
>
> It would be great for the filevault project to finally have its own
> package manager HTTP API (and maybe even a UI ;-)
> So let me know if you are interested and we can start discussing the
> approach and creating issues/tasks etc.
>
> Regards, Toby
>
> [0]
> https://github.com/tripodsan/jackrabbit-filevault/tree/vault-packagemgr/vault-packagemgr
> [1] https://www.getpostman.com/
> [2] https://documenter.getpostman.com/view/3704157/RVu2mVNL
> [3] https://issues.apache.org/jira/browse/JCRVLT-151
>
>
>
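
To illustrate the task-based idea for long-running operations mentioned in
the quoted mail, here is a minimal sketch. It is not taken from the
vault-packagemgr code or from JCRVLT-151: the class and method names are
hypothetical, and the actual install work is replaced by a placeholder. The
point is only that submitting returns a task id right away, so an HTTP client
could poll a task resource instead of keeping the connection open.

```java
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch: long-running package operations exposed as pollable tasks. */
final class InstallTasks {
    // Daemon thread so that a forgotten shutdown() does not keep the JVM alive.
    private final ExecutorService executor = Executors.newSingleThreadExecutor(runnable -> {
        Thread thread = new Thread(runnable, "install-tasks");
        thread.setDaemon(true);
        return thread;
    });
    private final Map<String, CompletableFuture<String>> tasks = new ConcurrentHashMap<>();

    /** Starts installing the given packages and returns a task id immediately. */
    String submit(List<String> packageIds) {
        String taskId = UUID.randomUUID().toString();
        tasks.put(taskId, CompletableFuture.supplyAsync(() -> {
            // Placeholder for the real work; a real task would call into the package registry.
            return "installed " + packageIds.size() + " package(s)";
        }, executor));
        return taskId;
    }

    /** What a GET on the task resource could report: UNKNOWN, RUNNING, FAILED, or the result. */
    String status(String taskId) {
        CompletableFuture<String> task = tasks.get(taskId);
        if (task == null) return "UNKNOWN";
        if (!task.isDone()) return "RUNNING";
        if (task.isCompletedExceptionally()) return "FAILED";
        return task.join();
    }

    void shutdown() {
        executor.shutdown();
    }
}
```

A single worker thread is used here on purpose, so that installs run one
after another, which is roughly what an execution plan would need.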


Re: new name for the multiplexing node store

2017-05-10 Thread Dominik Süß
Naming discussions - love it (where is my popcorn? ;) )

I would think that something with "Compositing" might be suitable, as this is
about the composition of something that works as a final result, while the
artifacts might not be useful on their own.

Cheers
Dominik

On 05.05.2017 20:40, "Robert Munteanu" wrote:

Hi,

On Fri, 2017-05-05 at 07:18 -0600, Matt Ryan wrote:
> I was wondering about this also WRT federated data store.  If the intent
> and effect of both are the same ("both" meaning what is currently called
> the "multiplexing node store" and the proposed (and in-progress) "federated
> data store"), it seems they should use a similar naming convention at
> least.
>
> WDYT?  Does that make it more confusing or less confusing?

I think the high-level intent is the same for both - compose a single
{Data,Node}Store out of multiple sub-stores.

The mechanisms might be different though, as the NodeStore is
hierarchical in nature, while the BlobStore blob ids are opaque.

Also I still maintain :-) that federated blob stores will work well
individually as they have no overall hierarchy to respect, while the
multiplexed node stores will have to be composed to create a meaningful
image.

Robert
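
As a rough illustration of why the hierarchy matters for the node store case:
a composed node store has to decide per path which sub-store owns it, while
opaque blob ids offer nothing to route on. The sketch below is not Oak code;
the class name, method names and store names are invented, and stores are
represented by plain strings.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: routing paths to sub-stores by mount point. */
final class MountResolver {
    private final Map<String, String> mounts = new LinkedHashMap<>(); // mount path -> store name
    private final String defaultStore;

    MountResolver(String defaultStore) {
        this.defaultStore = defaultStore;
    }

    MountResolver mount(String path, String storeName) {
        mounts.put(path, storeName);
        return this;
    }

    /** Returns the store owning the path: the longest matching mount, else the default store. */
    String storeFor(String path) {
        String best = null;
        for (String mountPath : mounts.keySet()) {
            boolean matches = path.equals(mountPath) || path.startsWith(mountPath + "/");
            if (matches && (best == null || mountPath.length() > best.length())) {
                best = mountPath;
            }
        }
        return best == null ? defaultStore : mounts.get(best);
    }
}
```

For example, with /libs and /apps mounted on a read-only store, /libs/foo
resolves to that store and /content to the default one. Neither sub-store
holds a complete tree on its own, which is the point made above.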

>
> -MR
>
> On Fri, May 5, 2017 at 6:10 AM, Julian Sedding 
> wrote:
>
> > Hi Tomek
> >
> > In all related discussions the term "mount" appears a lot. So why not
> > Mounting NodeStore? The module could be "oak-store-mount".
> >
> > Regards
> > Julian
> >
> >
> > On Fri, May 5, 2017 at 1:39 PM, Tomek Rekawek 
> > wrote:
> > > Hello oak-dev,
> > >
> > > the multiplexing node store has been recently extracted from the
> > > oak-core into a separate module and I’ve used it as an opportunity to
> > > rename the thing. The name I suggested is Federated Node Store. Robert
> > > doesn’t agree it’s the right name, mostly because the “partial” node
> > > stores, creating the combined (multiplexing / federated) one, are not
> > > usable on their own and store only a part of the overall repository
> > > content.
> > >
> > > Our arguments in their full lengths can be found in the OAK-6136 (last
> > > 3-4 comments), so there’s no need to repeat them here. We wanted to ask
> > > you for opinion about the name. We kind of agree that the “multiplexing”
> > > is not the best choice - can you suggest something else or maybe you
> > > think that “federated” is good enough?
> > >
> > > Thanks for the feedback.
> > >
> > > Regards,
> > > Tomek
> > >
> > > --
> > > Tomek Rękawek | Adobe Research | www.adobe.com
> > > reka...@adobe.com
> > >


Re: Oak JCR Observation scalability aspects and concerns

2013-10-22 Thread Dominik Süß
Hi :)

Speaking as a developer using the Sling eventing, I just wanted to add that in
most cases there are restrictions on paths (most times not just one but
multiple search paths) and on a resourceType (not just an exact match but a
set or pattern to identify a set of resourceTypes), and on some occasions
further constraints like the existence or a specific value of a specific
property. Currently it is up to the user to do this check within the
EventListener, but I think it would be feasible to register a listener that
defines those checks so that they can be processed by the implementation at a
low level. But it would be good to ask around in the Sling community how these
events are used in production, to be sure not to miss essential patterns.

Cheers
Dominik


On Tue, Oct 22, 2013 at 4:20 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:

> Hi,
>
> On Tue, Oct 22, 2013 at 9:59 AM, Felix Meschberger fmesc...@adobe.com
> wrote:
> > That's one Event object per event -- not one event per listener per
> > event.
> > This is completely different to JCR.
>
> You're mistaking the problem here, it's not the number of listeners,
> it's the number of events *per listener*.
>
> What we're looking at here is scaling out to write loads that could
> well have over a million changed items per second. On my laptop just
> instantiating a dummy Event object takes a few hundred nanoseconds, so
> there's no way to process millions of them per second in a single
> listener.
>
> BR,
>
> Jukka Zitting
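
The arithmetic behind that last point can be checked quickly. The 300 ns used
in the usage note below is only an assumed value for "a few hundred
nanoseconds", and the class and method names are made up.

```java
/** Back-of-the-envelope check of the per-listener event budget. */
final class EventBudget {

    /** Upper bound on events per second that a single thread can merely instantiate. */
    static long maxEventsPerSecond(long nanosPerEvent) {
        return 1_000_000_000L / nanosPerEvent;
    }

    /** Fraction of one core spent only on instantiation at the given event rate. */
    static double coreFraction(long eventsPerSecond, long nanosPerEvent) {
        return eventsPerSecond * (double) nanosPerEvent / 1_000_000_000L;
    }
}
```

With 300 ns per Event, maxEventsPerSecond(300) caps out at about 3.3 million
per second before the listener has done any real work, and at one million
events per second about 30% of a core already goes into creating the objects.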



Re: Oak JCR Observation scalability aspects and concerns

2013-10-22 Thread Dominik Süß
+1 on 4, since I fear 3 will create some overhead for existing solutions
that won't need this kind of scalability (and therefore create unnecessary
effort for migration).  This is the old compat pattern seen so often.

IMHO this should be an extension that can be installed but is not
available by default (to force devs to decide on that instead of being lazy
and not caring about deprecation).


On Tue, Oct 22, 2013 at 4:30 PM, Carsten Ziegeler cziege...@apache.org wrote:

> I really would like to have a constructive discussion here. I think the
> Sling use case is pretty well explained now - that's an api Sling offers
> and which is used by a lot of code out there (a great part of Sling is
> based on the OSGi events and layers on top of Sling are using it as well).
> That's a fact and it's also a fact that listeners for the OSGi event
> usually listen for all events.
>
> Now basically we have three/four options:
> 1. we leave everything as is - it works but might be slow with larger
> installations and heavy writes
> 2. we maintain the API as-is in Sling and try to make the implementation as
> fast as possible
> 3. we break compatibility in Sling, find a better solution, rewrite parts
> of Sling and require all downstream users to rewrite their stuff
> well, the fourth option would be
> 4. same as 3. but keep the old Sling API with a bold marker when it's used
> that this does not scale
>
> For the sake of compatibility I really would like to go with 2 which might
> require changes in Sling and Oak but sounds to me as the best compromise.
> In addition, it really would help the discussion if we would have
> performance tests showing us the real boundaries in terms of scalability
> with observation with some real figures.
>
> Thanks
> Carsten
> --
> Carsten Ziegeler
> cziege...@apache.org



Re: Oak JCR Observation scalability aspects and concerns

2013-10-22 Thread Dominik Süß
I just opened a thread at sling-dev for further discussion about api and
implementation changes on sling side [0]

For discussions around usage of this api within sling please use this
linked thread [0].

Best regards
Dominik


[0] markmail.org/thread/plb7ledhsna33r3g


On Tue, Oct 22, 2013 at 4:54 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:

> Hi,
>
> On Tue, Oct 22, 2013 at 10:39 AM, Carsten Ziegeler cziege...@apache.org
> wrote:
> > Just to reiterate :) if we go with 3 or 4, someone has to do the work in
> > Sling (and other places) and adapt the code. As obviously as soon as a
> > single listener is using the old pattern, the whole mechanism is moot.
>
> I think we can lay the groundwork with tools like the ones outlined in
> this thread, and postpone much of the required refactoring work to
> when we do have the appropriate benchmarks in place and a use case
> where such scale is needed in practice. At that point we can also make
> a more reasoned judgement of whether option 2 or 4 is the better
> solution for that particular case, i.e. are we still within such scale
> that normal optimization is good enough and broader design changes
> aren't needed. And we'll also have someone with a bad enough itch to
> scratch.
>
> As for 3 vs. 4, I think option 3 is clearly unworkable, as there's no
> immediate need to break backwards compatibility for most normal
> deployments. I'd go for option 4 of keeping the current mechanism with
> a note detailing the scalability issue and instructions on how to
> prepare for avoiding it. Note that option 4 is compatible with 2, as
> we can proceed on both fronts concurrently.
>
> BR,
>
> Jukka Zitting



Re: Inconsistent behavior upon moving nodes (was: Re: When moving a Tree, can it die?)

2013-02-07 Thread Dominik Süß
Sorry for jumping in without having the overall picture.
Is it really a good idea to use the path as identifier? For referenceable
identifiers (25.1) it is pretty clear that an ID is structure-independent, so
why should a non-referenceable one be bound to a path? Since a node can return
its path, I do not see a reason why the identifier should be the same. I'd
assume that someone implementing on that base would then know that the ID is
something reliable in the move case, while the path is really
something structure-specific. By a combination of both you can be sure to
handle the same entity.

Best regards
Dominik



On Wed, Feb 6, 2013 at 9:43 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:

> Hi,
>
> On Wed, Feb 6, 2013 at 8:04 PM, Marcel Reutegger mreut...@adobe.com
> wrote:
> > They are easy to fix but very difficult to detect and diagnose. No
> > exception is thrown, just the behavior is unexpected. I think this is
> > quite dangerous, since the spec is IMO quite clear about this
> >
> > http://www.day.com/specs/jcr/2.0/10_Writing.html#10.11.7%20Reflecting%20Item%20State
>
> That's not as straightforward as it sounds. For example, consider the
> following sequence:
>
> Node a = session.getNode("/foo");
> String id = a.getIdentifier();
>
> session.move("/foo", "/bar");
> session.getRootNode().addNode("foo");
>
> Node b = session.getNodeByIdentifier(id);
> assert a.isSame(b); // ???
>
> Should the last assertion pass or not? Interestingly it passes with
> both Jackrabbit 2.x and Oak, even though they return different values
> for b.getPath(). They're both correct!
>
> According to sections 10.11.7 and 10.11.4, item identity in such cases
> is determined by the item identifier, which means that a.isSame(b)
> should always be true. Thus, for implementations like Oak, that use
> the item path (up to root or the first referenceable ancestor) as the
> identifier of non-referenceable nodes, the effect of the above move()
> call should therefore be as if a.remove() had been called as otherwise
> the identity of a would change. As a consequence, b.getPath() should
> return /foo. In Jackrabbit 2.x, where each node has a unique
> non-path identifier, the move() call would change the path of a and
> result in b.getPath() returning /bar.
>
> The above rationale would imply that the right approach in Oak is to
> make sure that all session refreshes and transient moves trigger
> re-evaluation of the paths of referenceable nodes and those with a
> referenceable ancestor. Other nodes should keep behaving as they
> currently do.
>
> BR,
>
> Jukka Zitting
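
The difference between the two identifier strategies in the quoted sequence
can be reproduced with a toy model. This is not the JCR API and not how
Jackrabbit 2.x or Oak are implemented: the class and method names are
invented, nodes are reduced to internal numbers, and only single nodes (no
subtrees) can be moved.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (not the JCR API) contrasting path-based and stable node identifiers. */
final class IdentifierModel {
    private final Map<String, Integer> nodeByPath = new HashMap<>(); // path -> internal node number
    private final boolean pathAsIdentifier;
    private int nextNode = 1;

    IdentifierModel(boolean pathAsIdentifier) {
        this.pathAsIdentifier = pathAsIdentifier;
    }

    void addNode(String path) {
        nodeByPath.put(path, nextNode++);
    }

    /** Moves a single node; descendants are not modelled. */
    void move(String from, String to) {
        nodeByPath.put(to, nodeByPath.remove(from));
    }

    /** Identifier of the node currently at the given path. */
    String identifierOf(String path) {
        return pathAsIdentifier ? path : "id-" + nodeByPath.get(path);
    }

    /** Path of the node that the identifier resolves to now, or null if there is none. */
    String pathByIdentifier(String id) {
        if (pathAsIdentifier) {
            return nodeByPath.containsKey(id) ? id : null;
        }
        for (Map.Entry<String, Integer> entry : nodeByPath.entrySet()) {
            if (("id-" + entry.getValue()).equals(id)) {
                return entry.getKey();
            }
        }
        return null;
    }
}
```

Running the sequence from the mail (add /foo, remember its identifier, move
it to /bar, add a new /foo) and resolving the remembered identifier gives
/foo with path-based identifiers and /bar with stable ones, which is the
divergence described above.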



Re: Conflict handling in Oak

2012-12-21 Thread Dominik Süß
Hi, although I did not have the opportunity to jump in as planned I'm still
following changes and had some thoughts about that as well.

On Tue, Dec 18, 2012 at 4:51 PM, Michael Dürig mdue...@apache.org wrote:


> Right. However, degrading moves to remove/add node operations limits the
> size of sub trees which can be moved: if the moved sub tree (serialised to
> json add node operations) does not fit into heap, moves won't work at all.


When reading about splitting up a move operation into an add and a remove
operation, I realized that automatic merging might lead to strange
constellations. If a node is moved and the creation worked while the
removal ended in a conflict (worst case: two concurrent moves), the create
would be performed without any annotation, so it is not that easy to
perform a client-side resolution without the awareness of a dedicated move
operation.

IMHO a commit that needs a merge should always fail but return the
necessary diff information for client-side resolution. This information
could either be in a form that allows the client to just accept a
merge proposal or, for unresolvable conflicts, give the client the
opportunity to implement a mine/theirs resolution logic. That way it is up
to the client to define how relaxed the system should behave. The (core) API
could even provide the tooling to auto-accept or sequentially accept merge
proposals, so the remaining implementation effort for a client/binding can
be lowered.

Hope I did get everything right and my thoughts do not irritate too much.

Best regards
Dominik


Re: adaptTo(Berlin) 2012

2012-06-01 Thread Dominik Süß
Dear Apache Jackrabbit users and developers,

as Michael already announced, we'll organise a 2nd edition of
.adaptTo(Berlin) this September.

Based on the feedback of the attendees from last year's .adaptTo(),
we're trying to change some things.
The most interesting point for you might be the range of topics:

Last year we focused on talks that were based on the Sling stack.
This year we would also like to open it for sessions focused on
JCR without a direct connection to
Sling. With this step we hope to bring the communities of Apache
Jackrabbit, Apache Felix and Apache Sling closer together and learn
from each other.

Here are some additional things we'll have to / want to change:

- Location
Since Betahaus on the one hand is not available at the given dates,
and on the other hand there was some room for improvement regarding
the location (like space for more participants), we had to relocate
the event. We found a location which fulfills all requirements
without losing the industrial style we had in Betahaus last year. As
soon as the last details have been clarified we'll announce the
location.

- Length
As some of you might have noticed, this year's edition is three days
instead of the two days from last year. The reason is that we would
like to give you more time for interaction and also give 'newcomers'
the option to learn some basics of the frameworks on day one.

- Conference-Style
Based on the feedback, we would love to see more interactive sessions
and workshops. But since this is again a community-driven event,
everybody is more than welcome to discuss which topics might be worth
addressing and which style would be a good match (workshop, front
session, hands-on session, lightning talk, panel discussion). We
really would like to have an active discussion here on the
mailing lists and hear from you what you would like to do or to know.


We'd be happy to get some feedback on your thoughts about this
conference: feedback from last year's participants as well as ideas and
proposals from any potential participant for this year.

Best regards,
Dominik

On Tue, May 22, 2012 at 10:46 PM, Michael Dürig mdue...@apache.org wrote:
> Dear Apache Jackrabbit developers and users,
>
> after last year's edition (http://adaptto.mixxt.de/) which got very
> good feedback from attendees, Provision and Adobe would like to
> organize and invite you to an event focused on our technologies, in
> Berlin, Germany, on September 26th-28th, called .adaptTo(Berlin) 2012.
>
> A brief description of the event follows below; please note that this
> is kept in a draft fashion to let the communities (Sling, Jackrabbit)
> shape its final content and organization. So please let us know if you
> are interested, what suggestions and what questions you would like to
> post, and, most of all: what you want to present or you would like to
> see presented at .adaptTo(Berlin)
>
> This is a brief description which we also put up on the web site:
>
> .adaptTo(Berlin) is a technical meetup focused on the technical stack
> of Apache Sling including Apache Jackrabbit and Apache Felix and is
> addressed to all developers using this stack or parts of it. Specific
> content will be focused on Adobe CQ5, the commercial WCM system whose
> infrastructure is based on the Sling stack and the commercial JCR
> Implementation Adobe CRX.
>
> This event is sponsored by Adobe Systems and pro!vision GmbH which
> also acts as host.
>
> The goal of this meetup is the consolidation of the experience and
> knowledge of the existing Apache Sling, Apache Jackrabbit and Apache
> Felix community and to introduce the complete Sling stack to
> newcomers. To achieve this goal we'll try to bring developers and users
> (which means developers who work on top of this framework) together,
> talking about the framework and presenting solutions or best
> practices. Attendees that have not much experience with one or more
> parts of this stack will get the chance to get a kickstart
> introduction to Apache Jackrabbit, Apache Felix and Apache Sling on
> the first day.
>
> Day 1 of this meetup will therefore give inexperienced attendees the
> possibility to learn the basic principles of this technological stack,
> while experienced users can set up free meetups in the style of a
> barcamp.
>
> Days 2 and 3 will include sessions on the open source Apache projects,
> Apache Sling, Apache Felix and Apache Jackrabbit, as well as some
> breakout sessions focused on Adobe CQ5 and CRX. The idea is to have a
> good mix of presentations, hands-on sessions, free discussions and
> maybe even workshops or a hackathon.
>
> Note: This is only a draft how we think this event could work, but
> since this is a community driven event, further planning and
> discussion of the contents of the meeting will be continued in the
> public mailing lists.
>
> where/when:
> Berlin, September 26-28 2012
>
> Target audience:
> Developers working with Sling, Jackrabbit, CQ and other related projects.
>
> Michael


Re: Semantic distance search

2009-06-06 Thread Dominik Süß
Thanks for this hint.
So I probably should think about handling this search without Lucene and use
the JCR API for it, or combine both (for the tagging information, which should
be correct and should be used, since there is no real navigation because the
coupling for this way of tagging is based on loose references).
I had hoped I could use the Lucene index for all of it to get better
performance, but I think it's worth trying it the other way - now I have to
figure out how to combine my score with the Lucene score.

Best regards,
Dominik

On Fri, Jun 5, 2009 at 9:24 PM, Alexander Klimetschek aklim...@day.com wrote:

> No, the search will work, because the path information is not stored in the
> lucene index - hence no reindex is needed upon a move - and path location
> steps are handled without the lucene index.
>
> Regards,
> Alex
>
> --Alexander Klimetschek @iPhone
>
>
> On 05.06.2009 at 11:58, Dominik Süß dominik.su...@gmail.com wrote:
>
> > Hi Marcel,
> >
> > doesn't that mean I never can be sure I'll get a proper result when
> > searching for the path of a node?
> >
> > Best regards,
> > Dominik
> >
> > On Tue, Jun 2, 2009 at 1:43 PM, Marcel Reutegger marcel.reuteg...@gmx.net
> > wrote:
> >
> > > Hi,
> > >
> > > 2009/5/21 Dominik Süß dominik.su...@gmail.com:
> > > > Hi everybody,
> > > >
> > > > after having some time of indirect contact with JCR through Sling and
> > > > Day CRX/CQ, I now think it's time to get in touch with Jackrabbit
> > > > directly. As the subject says, I do this after having an idea which
> > > > I'd like to share and need some help to realize (since my Lucene
> > > > experience is close to nothing but pure usage and theory). I did try
> > > > to start with a proof of concept, but as I looked into the current
> > > > implementations of search in JCR I had to realize I need someone who
> > > > could give me a jumpstart and do the first steps together with me.
> > > > So here I go with my idea:
> > > >
> > > > I recently had some thoughts about something I'd call semantic
> > > > distance in multidimensional hierarchies (content structures +
> > > > hierarchical tagging like in CQ 5 [1]).
> > > >
> > > > The task I would like to fulfill: find the semantically closest nodes
> > > > for a given node.
> > > >
> > > > I postulate that structure represents the semantic relation, and
> > > > likewise that the referenced tags are in a hierarchy that represents
> > > > semantic relations. Furthermore I postulate that subnodes are
> > > > semantically a subset of the type of the parent node (not thinking of
> > > > JCR types but of semantic classifications).
> > > > This leads to the following thesis: the distance to the closest
> > > > shared parent node represents the unidirectional distance of a node
> > > > to another node. The result is that a whole branch has the same
> > > > distance to a node (which should be correct, since the subnode in the
> > > > branch belongs to the parent node which connects the branches we have
> > > > to look at).
> > > >
> > > > Figuring out a good way to produce an index for this really seems to
> > > > be hard, so I rethought my assumptions and came up with the following
> > > > way of determining the distance without indexing the explicit
> > > > distance (came up with this thought after reading a bit about the
> > > > Analyzers and Stemming).
> > > >
> > > > 1. For indexing, all referenced tag handles and the node's own handle
> > > > will be taken into account
> > > > 2. An analyzer produces string tokens out of each handle. Each handle
> > > > will be split up into multiple handles by removing the last node till
> > > > the root node is reached (so the node and every parent node is
> > > > indexed for this node as well as for each referenced tag)
> > >
> > > this will only work as long as you don't move nodes. moving a node in
> > > jackrabbit is a light weight operation, which means only the moved
> > > node is re-indexed. all descendant nodes are kept untouched even
> > > though their path (handle) changed!
> > >
> > > regards
> > >  marcel
> > >
> > > > 3. The query has to be built based on a given handle, since I want to
> > > > search for the semantically closest nodes.
> > > > 4. The query is built the same way as the Analyzer has to split the
> > > > handle into all parent handles.
> > > > Result: A 100% match can only be produced for the same node (for all
> > > > other nodes at least the own handle of the node is missing). The
> > > > semantically closer a node is, the more handles will match, which
> > > > will result in an ordering as I intended. Et voilà, we have all we
> > > > need to search for semantically close pages in a proper sorting
> > > > order.
> > > >
> > > > I might have a gap in my conclusions but didn't realise it yet. I'd
> > > > love to have some feedback and would appreciate some help to get
> > > > started with the mentioned proof of concept.
> > > >
> > > > WDYT?
> > > >
> > > > Best regards,
> > > > Dominik
> > > >
> > > > [1] http://dev.day.com/microsling/content/blogs/main/cq5tags.html
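
The tokenization and matching from steps 1 to 4 of the original proposal can
be tried out without Lucene. The sketch below is a plain set-overlap model,
not Jackrabbit's search implementation and not Lucene scoring; the class and
method names are invented, and leaving the root node out of the tokens is a
choice made here because it would match every node.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of the ancestor-handle tokenization from the proposal. */
final class HandleTokens {

    /** "/a/b/c" becomes ["/a/b/c", "/a/b", "/a"]: the handle plus every ancestor below the root. */
    static List<String> tokens(String handle) {
        List<String> result = new ArrayList<>();
        String current = handle;
        while (current.length() > 1 && current.lastIndexOf('/') >= 0) {
            result.add(current);
            current = current.substring(0, current.lastIndexOf('/'));
        }
        return result;
    }

    /** Tokens for a node: its own handle and all referenced tag handles (steps 1 and 2). */
    static Set<String> allTokens(List<String> handles) {
        Set<String> set = new HashSet<>();
        for (String handle : handles) {
            set.addAll(tokens(handle));
        }
        return set;
    }

    /** Fraction of the query node's tokens that the candidate shares (steps 3 and 4). */
    static double score(List<String> queryHandles, List<String> candidateHandles) {
        Set<String> query = allTokens(queryHandles);
        Set<String> shared = new HashSet<>(query);
        shared.retainAll(allTokens(candidateHandles));
        return query.isEmpty() ? 0.0 : (double) shared.size() / query.size();
    }
}
```

Only the node itself reaches a score of 1.0, and candidates that share deeper
ancestors or tags score higher than those that only share top-level ones.
Marcel's caveat still applies: the tokens are derived from paths, so they go
stale for descendants of a moved node unless those are re-indexed too.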
 





Re: Semantic distance search

2009-06-05 Thread Dominik Süß
Hi Marcel,

Doesn't that mean I can never be sure I'll get a proper result when
searching for the path of a node?
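
To make the concern concrete, here is a minimal sketch in plain Python (not the Jackrabbit or Lucene API; the class, node ids and paths are made up for illustration). It assumes, as Marcel describes below, that a move re-indexes only the moved node, so the path terms stored for its descendants go stale:

```python
def expand(handle):
    """Return the handle plus every parent handle (root excluded)."""
    parts = [p for p in handle.split("/") if p]
    return {"/" + "/".join(parts[:i]) for i in range(1, len(parts) + 1)}


class PathIndex:
    """Hypothetical stand-in for a search index keyed by node id."""

    def __init__(self):
        self.terms = {}  # node id -> set of indexed handle terms

    def index_node(self, node_id, handle):
        self.terms[node_id] = expand(handle)


# Current paths in the repository (hypothetical content).
paths = {"n1": "/content/a", "n2": "/content/a/child"}

index = PathIndex()
for node_id, handle in paths.items():
    index.index_node(node_id, handle)

# Move /content/a to /archive/a. Both paths change in the repository,
# but only the moved node itself is re-indexed.
paths["n1"] = "/archive/a"
paths["n2"] = "/archive/a/child"
index.index_node("n1", paths["n1"])

# The descendant still carries terms derived from its old path.
stale = index.terms["n2"] != expand(paths["n2"])
```

In this sketch a query built from the new path of the child would no longer match the terms indexed for it, unless the descendants are re-indexed explicitly after a move.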

Best regards,
Dominik

On Tue, Jun 2, 2009 at 1:43 PM, Marcel Reutegger
marcel.reuteg...@gmx.net wrote:

 Hi,

 2009/5/21 Dominik Süß dominik.su...@gmail.com:
  Hi everybody,
 
  after some time of indirect contact with JCR through Sling and Day
  CRX/CQ, I now think it's time to get in touch with Jackrabbit directly. As
  the subject says, I do this after having an idea which I'd like to share and
  need some help to realize (since my Lucene experience is close to nothing
  but pure usage and theory). I did try to start with a proof of concept, but when
  I looked at the current implementations of search in JCR I had to realize I
  need someone who could give me a jump start and take the first steps together
  with me. So here I go with my idea:

  I recently had some thoughts about something I'd call semantic distance in
  multidimensional hierarchies (content structures + hierarchical tagging like
  in CQ 5 [1]).

  The task I would like to fulfill: Find the semantically closest nodes for a
  given node.

  I postulate that the content structure represents semantic relations, and that
  the referenced tags are likewise in a hierarchy that represents semantic relations.
  Furthermore, I postulate that subnodes are semantically a subset of the type of
  the parent node (not thinking of JCR types but of semantic classifications).
  This leads to the following thesis: The distance to the closest shared
  parent node represents the unidirectional distance of a node to another node.
  The result is that a whole branch has the same distance to a node (which
  should be correct, since the subnode in the branch belongs to the parent node
  which connects the branches we have to look at).

  Figuring out a good way to produce an index for this really seems to
  be hard, so I rethought my assumptions and came up with the following way of
  determining the distance without indexing the explicit distance (I came up
  with this thought after reading a bit about Analyzers and stemming).
 
  1. For indexing, all referenced tag handles and the node's own handle will be
  taken into account.
  2. An analyzer produces string tokens out of each handle. Each handle will be
  split up into multiple handles by removing the last node until the root node is
  reached (so the node and every parent node is indexed for this node as well
  as for each referenced tag).

 this will only work as long as you don't move nodes. moving a node in
 jackrabbit is a lightweight operation, which means only the moved
 node is re-indexed. all descendant nodes are left untouched even
 though their path (handle) has changed!

 regards
  marcel

  3. The query has to be built based on a given handle, since I want to search for
  the semantically closest nodes.
  4. The query is built the same way as the analyzer splits the handle
  into all parent handles.
  Result: A 100% match can only be produced for the same node (for all other
  nodes at least the node's own handle is missing). The semantically
  closer a node is, the more handles will match, which will result in the ordering
  I intended. Et voilà, we have all we need to search for
  semantically close pages in a proper sort order.

  I might have a gap in my conclusions that I haven't noticed yet. I'd love to
  have some feedback and would appreciate some help getting started with the
  mentioned proof of concept.
 
  WDYT?
 
  Best regards,
  Dominik
 
  [1] http://dev.day.com/microsling/content/blogs/main/cq5tags.html
 



Semantic distance search

2009-05-21 Thread Dominik Süß
Hi everybody,

after some time of indirect contact with JCR through Sling and Day
CRX/CQ, I now think it's time to get in touch with Jackrabbit directly. As
the subject says, I do this after having an idea which I'd like to share and
need some help to realize (since my Lucene experience is close to nothing
but pure usage and theory). I did try to start with a proof of concept, but when
I looked at the current implementations of search in JCR I had to realize I
need someone who could give me a jump start and take the first steps together
with me. So here I go with my idea:

I recently had some thoughts about something I'd call semantic distance in
multidimensional hierarchies (content structures + hierarchical tagging like
in CQ 5 [1]).

The task I would like to fulfill: Find the semantically closest nodes for a
given node.

I postulate that the content structure represents semantic relations, and that the
referenced tags are likewise in a hierarchy that represents semantic relations.
Furthermore, I postulate that subnodes are semantically a subset of the type of
the parent node (not thinking of JCR types but of semantic classifications).
This leads to the following thesis: The distance to the closest shared
parent node represents the unidirectional distance of a node to another node.
The result is that a whole branch has the same distance to a node (which
should be correct, since the subnode in the branch belongs to the parent node
which connects the branches we have to look at).
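
As a minimal sketch of this thesis (plain Python, not Jackrabbit API; the function names and example paths are made up for illustration), the unidirectional distance can be computed as the number of steps up from the source node to the closest shared parent node. It assumes that handles are slash-separated paths and that a node counts as its own closest shared parent when the target lies below it:

```python
def split_path(handle):
    """Split a slash-separated handle into its node names."""
    return [p for p in handle.split("/") if p]


def shared_depth(a, b):
    """Depth of the deepest node that lies on both paths (root = 0)."""
    depth = 0
    for x, y in zip(split_path(a), split_path(b)):
        if x != y:
            break
        depth += 1
    return depth


def distance(source, target):
    """Steps up from source to the closest parent node shared with target."""
    return len(split_path(source)) - shared_depth(source, target)


news = "/content/sports/football/news"
d_sibling = distance(news, "/content/sports/football/results")
d_branch_top = distance(news, "/content/sports/tennis")
d_branch_deep = distance(news, "/content/sports/tennis/wimbledon/2009")
d_reverse = distance("/content/sports/tennis/wimbledon/2009", news)
```

Both nodes in the tennis branch get the same distance from the news node, and the reverse direction gives a different value, which is why the distance is unidirectional.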

Figuring out a good way to produce an index for this really seems to
be hard, so I rethought my assumptions and came up with the following way of
determining the distance without indexing the explicit distance (I came up
with this thought after reading a bit about Analyzers and stemming).

1. For indexing, all referenced tag handles and the node's own handle will be taken
into account.
2. An analyzer produces string tokens out of each handle. Each handle will be
split up into multiple handles by removing the last node until the root node is
reached (so the node and every parent node is indexed for this node as well
as for each referenced tag).
3. The query has to be built based on a given handle, since I want to search for
the semantically closest nodes.
4. The query is built the same way as the analyzer splits the handle
into all parent handles.
Result: A 100% match can only be produced for the same node (for all other
nodes at least the node's own handle is missing). The semantically
closer a node is, the more handles will match, which will result in the ordering
I intended. Et voilà, we have all we need to search for
semantically close pages in a proper sort order.
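
The four steps above can be sketched roughly as follows. This is plain Python rather than a Lucene analyzer, all names and paths are hypothetical, and the score is simply the fraction of query terms that match, which is an assumption and not how Lucene actually scores. It also assumes the query is built from the given node's own handle plus its tag handles, and that the root itself is not emitted as a token since it would match every node:

```python
def expand(handle):
    """Step 2: the handle plus every parent handle (root excluded)."""
    parts = [p for p in handle.split("/") if p]
    return {"/" + "/".join(parts[:i]) for i in range(1, len(parts) + 1)}


def index_terms(own_handle, tag_handles):
    """Step 1: terms from the node's own handle and all referenced tags."""
    terms = set(expand(own_handle))
    for tag in tag_handles:
        terms |= expand(tag)
    return terms


def rank(query_handle, query_tags, index):
    """Steps 3 and 4: build the query like the index terms, score by overlap."""
    query = index_terms(query_handle, query_tags)
    scored = [(len(query & terms) / len(query), handle)
              for handle, terms in index.items()]
    # Highest score first; ties are broken by handle for a stable order.
    return sorted(scored, key=lambda s: (-s[0], s[1]))


# Hypothetical content nodes with their referenced tags.
index = {
    "/content/a/b": index_terms("/content/a/b", ["/etc/tags/x/y"]),
    "/content/a/c": index_terms("/content/a/c", ["/etc/tags/x/z"]),
    "/content/d": index_terms("/content/d", []),
}

results = rank("/content/a/b", ["/etc/tags/x/y"], index)
```

Only the queried node itself reaches a full match; the sibling with a neighbouring tag comes next, and the unrelated node comes last.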

I might have a gap in my conclusions that I haven't noticed yet. I'd love to
have some feedback and would appreciate some help getting started with the
mentioned proof of concept.

WDYT?

Best regards,
Dominik

[1] http://dev.day.com/microsling/content/blogs/main/cq5tags.html

