RE: Use cases for the graph streams

2020-05-26 Thread Nightingale, Jonathan A (US)
Without getting too far into the weeds with our product, we have a bunch of 
Solr records that represent entities and their relationships to other entities 
or files. For example, a document may describe a bunch of people. We have 
entries for the people as well as the document. We also have entries that 
represent connections between these people based on things described in the 
document.

Each of those Solr records holds a bunch of docIds as references, which lets 
us link them. We build a graph in a separate graph store so we can do 
traversals on it. We do semantic filtering, such as finding people nodes that 
are related to the people nodes described in a given document. That 
relationship can be expanded to allow for longer walks on the graph, for 
example finding everything up to N steps from this person.
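
A minimal sketch of that kind of walk, in plain Python rather than Solr. The 
record layout and the field names `id`, `type`, and `refs` are assumptions made 
for illustration, not the actual schema, and references are treated as 
undirected links:

```python
from collections import deque

# Hypothetical records: an id, an entity type, and the docIds it references.
RECORDS = [
    {"id": "doc1", "type": "document", "refs": ["p1", "p2"]},
    {"id": "p1", "type": "person", "refs": ["p3"]},
    {"id": "p2", "type": "person", "refs": []},
    {"id": "p3", "type": "person", "refs": ["p4"]},
    {"id": "p4", "type": "person", "refs": []},
]


def build_adjacency(records):
    """Treat every docId reference as an undirected edge between two ids."""
    adj = {r["id"]: set() for r in records}
    for r in records:
        for ref in r["refs"]:
            adj.setdefault(ref, set())
            adj[r["id"]].add(ref)
            adj[ref].add(r["id"])
    return adj


def within_n_steps(adj, start, n):
    """Breadth-first walk: map each node within n steps to its distance."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == n:
            continue  # do not expand past the step limit
        for neighbor in adj.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def people_near(records, start, n):
    """Semantic filter: keep only 'person' nodes reached within n steps."""
    types = {r["id"]: r["type"] for r in records}
    dist = within_n_steps(build_adjacency(records), start, n)
    return sorted(i for i in dist if i != start and types.get(i) == "person")
```

Calling `people_near(RECORDS, "doc1", 2)` walks two links out from the 
document and keeps only the person nodes it reaches.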

We also allow just viewing all nodes on the graph that are within N steps of a 
target node. The user can explore the information that way: when they shift to 
another node, we display the new subgraph around their new focus node.
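
The explore-and-refocus view can be sketched the same way. This is an 
illustration only: the `GRAPH` mapping and the function names are hypothetical, 
and the graph is assumed to be undirected and small enough to hold in memory:

```python
from collections import deque

# Hypothetical undirected graph as an adjacency mapping.
GRAPH = {
    "a": {"b", "c"},
    "b": {"a", "d"},
    "c": {"a"},
    "d": {"b", "e"},
    "e": {"d"},
}


def neighborhood(graph, focus, n):
    """Return the set of nodes reachable from focus in at most n steps."""
    seen = {focus}
    frontier = deque([(focus, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == n:
            continue  # step limit reached for this branch
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return seen


def subgraph(graph, focus, n):
    """Return (nodes, edges) of the subgraph to display around focus."""
    nodes = neighborhood(graph, focus, n)
    edges = {
        tuple(sorted((u, v)))
        for u in nodes
        for v in graph.get(u, ())
        if v in nodes
    }
    return nodes, edges
```

Shifting focus is then just another call with the new node, for example 
`subgraph(GRAPH, "a", 1)` followed by `subgraph(GRAPH, "d", 1)`.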

So those are the kinds of things I was hoping to do with the gather nodes 
functions in Solr, but I couldn't find a simple way to do it.

Jonathan


Re: Use cases for the graph streams

2020-05-21 Thread Joel Bernstein
Good question. Let me first point to an interesting example in the Visual
Guide to Streaming Expressions and Math Expressions:

https://github.com/apache/lucene-solr/blob/visual-guide/solr/solr-ref-guide/src/search-sample.adoc#nodes

This example gets to the heart of the core use case for the nodes
expression, which is to discover the relationships between nodes in a graph.
It's a discovery tool for learning something new about the data that you
can't see without the specific ability to walk the nodes in a graph.

In the broader context, the nodes expression is part of a much wider set of
tools that allow people to use Solr to explore the relationships in their
data. This is described here:

https://github.com/apache/lucene-solr/blob/visual-guide/solr/solr-ref-guide/src/math-expressions.adoc

The goal of all this is to move search engines beyond basic aggregations to
study the correlations and relationships within the data set.

Graph traversal is part of this broader goal, which will get developed more
over time. I'd be interested in hearing more about the specific graph use
cases that you're interested in solving.

Joel Bernstein
http://joelsolr.blogspot.com/


Use cases for the graph streams

2020-05-20 Thread Nightingale, Jonathan A (US)
This is kind of a broad question, but I was playing with the graph streams and 
was having trouble making the tools work for what I wanted to do. I'm wondering 
whether the graph streams are really meant to support the standard graph 
queries you might write with Gremlin or the like. I ask because right now we 
have two implementations of our data storage to support these two ways of 
looking at it, the standard query and the semantic filtering.

The use cases I usually see for the graph streams always seem to be limited to 
a one-link traversal for finding things related to nodes gathered from a query. 
But even with that, the best way to handle lists of doc values wasn't clear. 
For example, to represent a node that had many doc values, I had to use cross 
products to make a node for each doc value. The traversal didn't seem to allow 
for that kind of node linking inherently.
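
To make the cross-product step concrete, here is a minimal sketch in plain 
Python, not Solr's streaming expressions. The record, its field names, and the 
`explode` helper are all hypothetical, and every listed field is assumed to 
hold a list of values:

```python
from itertools import product

# Hypothetical record with two multi-valued fields.
RECORD = {"id": "doc1", "people": ["p1", "p2"], "files": ["f1", "f2"]}


def explode(record, fields):
    """Yield one flat copy of record per combination of the listed fields.

    A missing or empty field contributes a single None so the record
    still produces output.
    """
    value_lists = [record.get(field) or [None] for field in fields]
    for combo in product(*value_lists):
        flat = dict(record)  # shallow copy; other fields stay untouched
        flat.update(zip(fields, combo))
        yield flat
```

Exploding on `["people"]` gives one single-valued row per person, each of 
which can then act as its own node or edge; exploding on 
`["people", "files"]` gives every person and file pairing.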

So my question really is (and maybe this is not the place for this): what is 
the intent of these graph features, and what is the goal for them in the 
future? I was really hoping at one point to use only Solr for our product, but 
it didn't seem feasible, at least not easily.

Thanks for all your help
Jonathan

Jonathan Nightingale
GXP Solutions Engineer
(office) 315 838 2273
(cell) 315 271 0688