[arangodb-google] Graph traversal and subgraphs

2016-07-14 Thread Roman
Hi, I have a graph (see the attachment for an example) whose edges have a 
property "type". I want to traverse the graph, but follow only specific edges 
(for example type == "cdp"). I tried the following query:

FOR d IN vDevice
  FILTER d.hostname == "lab54unl85AC172"
  FOR v, e, p IN 1..1 ANY d GRAPH 'linkGraph'
    OPTIONS { 'uniqueVertices': 'global', 'uniqueEdges': 'global' }
    FILTER e.type == "cdp"
    RETURN { v: v.hostname, e: e.type, pe: p.edges[*].type,
             pv: p.vertices[*].hostname }

or with a filter on the path instead:

filter p.edges[-1].type == "cdp"

But neither gives correct results. It seems that the filters are applied only 
to the results and that the traversal itself is not affected by them. The 
same happens if I try to filter on vertices.

Is there a way to run a traversal on a subgraph (define a subset of edges and 
vertices and then traverse only those)?
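
One possible explanation, offered as an assumption and not confirmed anywhere in this thread: when every vertex may be visited only once, a neighbour that is first reached over a non-matching edge is marked as visited, so a parallel matching edge to it is never reported, and a filter applied afterwards drops the only result. The toy Python model below (hypothetical data and function names, not ArangoDB code) contrasts filtering the results with filtering while edges are expanded.

```python
from collections import deque

# Hypothetical toy data: A and B are joined by two parallel links of different types.
EDGES = [
    ("A", "B", "lldp"),
    ("A", "B", "cdp"),
    ("B", "C", "cdp"),
]

def neighbours(vertex):
    """Yield (other_vertex, edge_type) for every edge touching vertex (direction ANY)."""
    for src, dst, etype in EDGES:
        if src == vertex:
            yield dst, etype
        elif dst == vertex:
            yield src, etype

def traverse(start, max_depth, edge_ok=lambda etype: True):
    """Breadth-first traversal that visits each vertex at most once.

    edge_ok is checked while expanding, so rejected edges are never followed.
    Returns a list of (vertex, edge_type, depth) tuples.
    """
    seen = {start}
    queue = deque([(start, 0)])
    results = []
    while queue:
        vertex, depth = queue.popleft()
        if depth == max_depth:
            continue
        for other, etype in neighbours(vertex):
            if not edge_ok(etype) or other in seen:
                continue
            seen.add(other)
            results.append((other, etype, depth + 1))
            queue.append((other, depth + 1))
    return results

# Filter applied to the results only: B is first reached over the "lldp" edge,
# is marked as seen, and the parallel "cdp" edge is never reported.
post_filtered = [r for r in traverse("A", 1) if r[1] == "cdp"]

# Filter applied during the traversal: only "cdp" edges are followed.
pruned = traverse("A", 1, edge_ok=lambda etype: etype == "cdp")

print(post_filtered)  # []
print(pruned)         # [('B', 'cdp', 1)]
```

In this model the subgraph is defined purely by the edge_ok predicate; a vertex predicate could be added in the same place.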

Thanks

Roman




-- 
You received this message because you are subscribed to the Google Groups 
"ArangoDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to arangodb+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[arangodb-google] Re: Graph traversal and subgraphs

2016-07-14 Thread Roman
According to the documentation

https://docs.arangodb.com/3.0/AQL/Graphs/Traversals.html

the explain output should list the filter under "Filter conditions", 
something like this:

Traversals on graphs:
 Id   Depth   Vertex collections   Edge collections   Filter conditions
  2   1..3    circles              edges              `Path`.`edges`[0] -> v.`label` == "right_foo"

But in my explain output that column is empty:

Traversals on graphs:
 Id   Depth   Vertex collections   Edge collections   Filter conditions
  5   1..1    vDevice              eLink



[arangodb-google] Managed ArangoDB service

2016-12-06 Thread Roman Yarinovsky
Hi,

I am researching database options for our needs and I really like ArangoDB. 
The only issue is that I couldn't find any managed service or managed 
hosting for ArangoDB.

For example, Amazon RDS on AWS lets us scale up easily without worrying 
about clustering and configuration.

Is there any service that can manage this for me, or should I manage it 
myself?



[arangodb-google] Faceted Search Performance

2017-09-14 Thread Roman Kuzmik
We are evaluating ArangoDB's performance for facet calculations.
A number of other products can do the same, either via a special API or via 
their query language:

   - MarkLogic Facets
   - ElasticSearch Aggregations
   - Solr Faceting
   - etc.

We understand that there is no special API in ArangoDB to calculate facets 
explicitly.
In practice it is not needed: thanks to the comprehensive AQL, facets can be 
computed with a simple query like:

 FOR a in Asset 
  COLLECT attr = a.attribute1 INTO g
 RETURN { value: attr, count: length(g) }

This query calculates a facet on *attribute1* and yields the frequencies in 
the form of:

[
  {
"value": "test-attr1-1",
"count": 200
  },
  {
"value": "test-attr1-2",
"count": 200
  },
  {
"value": "test-attr1-3",
"count": 300
  }
]


It says that across my entire collection *attribute1* takes three values 
(test-attr1-1, test-attr1-2 and test-attr1-3), with the related counts 
provided.
In effect we run a DISTINCT query and aggregate the counts.

It looks simple and clean, with only one, but really big, issue - *performance*.

The query above runs for *31 seconds* on a test collection with only *8M* 
documents.
We have experimented with different index types and storage engines (with 
and without RocksDB) and studied the explain plans, to no avail.
The documents in this test are very concise, with only three short 
attributes.

We would appreciate any input at this point.
Either we are doing something wrong, or ArangoDB simply is not designed to 
perform well in this particular area.

By the way, the ultimate goal would be to run something like the following 
in under a second:

LET docs = (FOR a IN Asset
  FILTER a.name like 'test-asset-%'
  SORT a.name
  RETURN a)

LET attribute1 = (
  FOR a in docs
    COLLECT attr = a.attribute1 INTO g
    RETURN { value: attr, count: length(g[*]) }
)

LET attribute2 = (
  FOR a in docs
    COLLECT attr = a.attribute2 INTO g
    RETURN { value: attr, count: length(g[*]) }
)

LET attribute3 = (
  FOR a in docs
    COLLECT attr = a.attribute3 INTO g
    RETURN { value: attr, count: length(g[*]) }
)

LET attribute4 = (
  FOR a in docs
    COLLECT attr = a.attribute4 INTO g
    RETURN { value: attr, count: length(g[*]) }
)

RETURN {
  counts: (RETURN {
    total: LENGTH(docs),
    offset: 2,
    to: 4,
    facets: {
      attribute1: {
        from: 0,
        to: 5,
        total: LENGTH(attribute1)
      },
      attribute2: {
        from: 5,
        to: 10,
        total: LENGTH(attribute2)
      },
      attribute3: {
        from: 0,
        to: 1000,
        total: LENGTH(attribute3)
      },
      attribute4: {
        from: 0,
        to: 1000,
        total: LENGTH(attribute4)
      }
    }
  }),
  items: (FOR a IN docs LIMIT 2, 4 RETURN {id: a._id, name: a.name}),
  facets: {
    attribute1: (FOR a in attribute1 SORT a.count LIMIT 0, 5 return a),
    attribute2: (FOR a in attribute2 SORT a.value LIMIT 5, 10 return a),
    attribute3: (FOR a in attribute3 LIMIT 0, 1000 return a),
    attribute4: (FOR a in attribute4 SORT a.count, a.value LIMIT 0, 1000 return a)
  }
}

Thanks!





[arangodb-google] Re: Faceted Search Performance

2017-09-14 Thread Roman Kuzmik
By the way, FYI: the first query (i.e. LENGTH(g)) with an index on attribute1 
runs almost as fast as the second query (i.e. WITH COUNT).
Here, the second query with an index takes 4.4 seconds.
But that is still just one facet. Usually you need a bunch, as in my 
"long" query in the very first post.
Let me rewrite it using WITH COUNT and create an index on each facet 
attribute to see how long it takes. Currently, "as-is", it takes 117 seconds.
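
As a rough illustration of how the two variants differ in what they keep per group, here is a minimal Python model. It is an analogy only, not a description of ArangoDB internals; the sample documents and both function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical sample standing in for the Asset collection.
assets = [
    {"name": "test-asset-1", "attribute1": "test-attr1-1"},
    {"name": "test-asset-2", "attribute1": "test-attr1-2"},
    {"name": "test-asset-3", "attribute1": "test-attr1-1"},
]

def facet_with_groups(docs, attribute):
    """Collect whole documents per value, then measure each group
    (analogous to COLLECT ... INTO g followed by LENGTH(g))."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[attribute]].append(doc)
    return [{"value": v, "count": len(g)} for v, g in sorted(groups.items())]

def facet_with_counter(docs, attribute):
    """Keep only a running count per value
    (analogous to COLLECT ... WITH COUNT INTO length)."""
    counts = Counter(doc[attribute] for doc in docs)
    return [{"value": v, "count": n} for v, n in sorted(counts.items())]

print(facet_with_groups(assets, "attribute1"))
print(facet_with_counter(assets, "attribute1"))
```

Both functions return the same facet; the second never holds the grouped documents in memory.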

Thanks for looking into this issue.



[arangodb-google] Re: Faceted Search Performance

2017-09-18 Thread Roman Kuzmik
I compiled your changes from feature/mmfiles-hash-lookup-performance.
Indeed, on a single facet we are down to 4 seconds from 6 seconds (in the 
test case provided above), and no indexes are needed.

Hope it makes it to master soon.



[arangodb-google] Re: Faceted Search Performance

2017-09-18 Thread Roman Kuzmik
At 4 seconds per facet, adding 3 more takes us to 16 seconds.
By the way, why is that? ArangoDB is doing a full scan anyway. Is it doing 
it 4 times with the query below? Is there any way to make it smarter?

LET docs = (FOR a IN Asset
  RETURN a)
LET attribute1 = (
  FOR a in docs
    COLLECT attr = a.attribute1 WITH COUNT INTO length
    RETURN { value: attr, count: length }
)
LET attribute2 = (
  FOR a in docs
    COLLECT attr = a.attribute2 WITH COUNT INTO length
    RETURN { value: attr, count: length }
)
LET attribute3 = (
  FOR a in docs
    COLLECT attr = a.attribute3 WITH COUNT INTO length
    RETURN { value: attr, count: length }
)
LET attribute4 = (
  FOR a in docs
    COLLECT attr = a.attribute4 WITH COUNT INTO length
    RETURN { value: attr, count: length }
)
RETURN {
  counts: (RETURN {
    total: LENGTH(docs),
    offset: 2,
    to: 4,
    facets: {
      attribute1: {
        from: 0,
        to: 5,
        total: LENGTH(attribute1)
      },
      attribute2: {
        from: 5,
        to: 10,
        total: LENGTH(attribute2)
      },
      attribute3: {
        from: 0,
        to: 1000,
        total: LENGTH(attribute3)
      },
      attribute4: {
        from: 0,
        to: 1000,
        total: LENGTH(attribute4)
      }
    }
  }),
  items: (FOR a IN docs LIMIT 2, 4 RETURN {id: a._id, name: a.name}),
  facets: {
    attribute1: (FOR a in attribute1 SORT a.count LIMIT 0, 5 return a),
    attribute2: (FOR a in attribute2 SORT a.value LIMIT 5, 10 return a),
    attribute3: (FOR a in attribute3 LIMIT 0, 1000 return a),
    attribute4: (FOR a in attribute4 SORT a.count, a.value LIMIT 0, 1000 return a)
  }
}
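
For comparison, the sketch below shows what "one scan for all facets" could look like outside the database: a single pass over the documents that updates one counter per facet attribute. This is a hedged Python illustration with hypothetical sample data and a hypothetical function name; it says nothing about how the AQL query above is actually executed.

```python
from collections import Counter

# Hypothetical sample standing in for the Asset collection.
assets = [
    {"name": "test-asset-1", "attribute1": "a", "attribute2": "x"},
    {"name": "test-asset-2", "attribute1": "b", "attribute2": "x"},
    {"name": "test-asset-3", "attribute1": "a", "attribute2": "y"},
]

def facets_single_pass(docs, attributes):
    """Compute value counts for several attributes in one pass over docs.

    A missing attribute is counted under None, similar to a null group.
    """
    counters = {attr: Counter() for attr in attributes}
    for doc in docs:                      # the collection is scanned once
        for attr in attributes:           # every facet is updated per document
            counters[attr][doc.get(attr)] += 1
    return {attr: dict(counter) for attr, counter in counters.items()}

print(facets_single_pass(assets, ["attribute1", "attribute2"]))
```

The work per document still grows with the number of facets, but the documents themselves are read only once.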




[arangodb-google] Re: Faceted Search Performance

2017-09-19 Thread Roman Kuzmik
Thanks for the hint!

We have written a small service facade to calculate facets.
It splits the huge AQL query provided above into 5 queries:

   - main - to filter, sort and retrieve matching entities:

     LET docs = (FOR a IN Asset
       FILTER a.name like 'test-asset-%'
       SORT a.name
       RETURN a)
     RETURN {
       counts: (RETURN {
         total: LENGTH(docs),
         offset: 0,
         to: 5
       }),
       items: (FOR a IN docs LIMIT 0, 5 RETURN a)
     }

   - 4 small ones to purely calculate facets:

     LET docs = (FOR a IN Asset
       FILTER a.name like 'test-asset-%'
       RETURN a)
     LET attributeX = (
       FOR a in docs
         COLLECT attr = a.attributeX WITH COUNT INTO length
         RETURN { value: attr, count: length }
     )
     RETURN {
       counts: (RETURN {
         total: LENGTH(docs),
         offset: 0,
         to: -1,
         facets: {
           attributeX: {
             from: 0,
             to: 1000,
             total: LENGTH(attributeX)
           }
         }
       }),
       facets: {
         attributeX: (FOR a in attributeX LIMIT 0, 1000 return a)
       }
     }

  
We run these using Java 8's Fork/Join framework and basically execute 
Map (split into sub-queries) / Reduce (merge results), potentially against 
an ArangoDB cluster.
We run a custom ArangoDB build from Jan's feature branch, and the results 
are pretty impressive.
Remember, we started with 140 secs for the full AQL and are now down to 
11 seconds (with sort/filter + skiplist on name) or 4 seconds (w/o 
sort/filter, 4 facets).
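
The split/merge pattern described here can be sketched as follows. The original service uses Java 8 Fork/Join; this is a Python analogue with a thread pool, and run_query together with its canned responses is a hypothetical stand-in for sending one AQL query to the server, not a real driver API.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical canned responses; a real service would send AQL to the server here.
CANNED = {
    "main": {"total": 3, "items": ["test-asset-1", "test-asset-2", "test-asset-3"]},
    "attribute1": [{"value": "test-attr1-1", "count": 2},
                   {"value": "test-attr1-2", "count": 1}],
    "attribute2": [{"value": "test-attr2-1", "count": 3}],
}

def run_query(name):
    """Stand-in for executing one sub-query (assumed helper, not a driver call)."""
    return CANNED[name]

def faceted_search(facet_names):
    """Map: submit the main query and one query per facet.
    Reduce: merge all partial results into a single response."""
    with ThreadPoolExecutor(max_workers=len(facet_names) + 1) as pool:
        main_future = pool.submit(run_query, "main")
        facet_futures = {name: pool.submit(run_query, name) for name in facet_names}
        merged = dict(main_future.result())
        merged["facets"] = {name: fut.result() for name, fut in facet_futures.items()}
    return merged

result = faceted_search(["attribute1", "attribute2"])
print(result["total"], sorted(result["facets"]))
```

With this shape the total latency is bounded by the slowest sub-query instead of the sum of all of them, provided the server can run them concurrently.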

I think we are satisfied for now :-) and hope this PR will make it to 
'master'.

Also, it would be awesome if the AQL pipeline were advanced in the future 
to accommodate more analytical types of queries (facets with sub-facets, 
ElasticSearch-style sub-grouping, etc.).



[arangodb-google] Re: Faceted Search Performance

2017-09-14 Thread Roman Kuzmik
Thanks Jan for your reply!

But yes, we have tried the "2.x old school" approach *WITH COUNT*, as well 
as the brand new *DISTINCT*.
Both yield similarly sluggish results :-/



[arangodb-google] Re: Faceted Search Performance

2017-09-20 Thread Roman Kuzmik
done
