Re: [Neo4j] Neo4j Data Capacities

2011-11-24 Thread bm3780
OK, the previous answer did not address my particular question in this
thread.  My application is a new initiative, so there is no existing data
store.  However, we do intend to ingest other data sets to make our data
more interesting to our users.  Some of the data sets we are interested in
are on the order of billions of nodes if we were to actually ingest them,
which is why we are trying to brainstorm possible solutions.  Our initial
goal, however, is to store only our own native data, which, as I said
before, is very structured but has some social aspects.  The structured
data itself is also highly interconnected, both with itself and with other
parts of the structured data.

--
View this message in context: 
http://neo4j-community-discussions.438527.n3.nabble.com/Neo4j-Data-Capacities-tp3533552p3533729.html
Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Neo4j Data Capacities

2011-11-24 Thread Pekka Honkonen
Great, as you see. We are a small startup (www.epygg.com), and one major CC
company has contacted us because we have a method that they really want.
I don't want to just license it to them, so I would like to build a
solution on top of my technology. The key is that we can generate one field
of data that needs to be combined with all the traditional credit card
payment data in order to find correlations between them.

Pekka




On Thu, Nov 24, 2011 at 4:05 PM, Jim Webber  wrote:

> Hi Pekka,
>
> There are already (prominent) folks using Neo4j in that kind of credit
> card fraud detection. I hope some of them could volunteer their experiences
> (though not necessarily their proprietary clever stuff) on this list.
>
> Jim


Re: [Neo4j] Neo4j Data Capacities

2011-11-24 Thread Jim Webber
Hi Pekka,

There are already (prominent) folks using Neo4j in that kind of credit card 
fraud detection. I hope some of them could volunteer their experiences (though 
not necessarily their proprietary clever stuff) on this list.

Jim


Re: [Neo4j] Neo4j Data Capacities

2011-11-24 Thread Pekka Honkonen
No, we don't have that store yet, but we are developing a new type of fraud
detection solution for a CC vendor. We have one asset that they are looking
for, but we would like to add real-time correlation identification to the
solution. If we manage to get the deal, the solution will process that
number of transactions. We got the numbers from the CC company.
-Pekka

On Thu, Nov 24, 2011 at 3:54 PM, Michael Hunger <
michael.hun...@neotechnology.com> wrote:

> Sounds great, so do you already reach those numbers?
>
> How do you store your data today?
>
> Michael
>
> On 24.11.2011 at 14:53, bm3780 wrote:
>
> > All of my data is interconnected and rich.  This is why I like the idea
> > of a graph.


Re: [Neo4j] Neo4j Data Capacities

2011-11-24 Thread Michael Hunger
Sounds great, so do you already reach those numbers? 

How do you store your data today?

Michael

On 24.11.2011 at 14:53, bm3780 wrote:

> All of my data is interconnected and rich.  This is why I like the idea of a
> graph.


Re: [Neo4j] Neo4j Data Capacities

2011-11-24 Thread bm3780
All of my data is interconnected and rich.  This is why I like the idea of a
graph.

--
View this message in context: 
http://neo4j-community-discussions.438527.n3.nabble.com/Neo4j-Data-Capacities-tp3533552p3533673.html


Re: [Neo4j] Neo4j Data Capacities

2011-11-24 Thread Michael Hunger
If your data exceeds those amounts, then polyglot persistence is probably the 
way to go.

Is the other part of your data also interconnected and rich or is it just the 
social part?

All of that depends not only on the storage but also, to a large extent, on
the use cases and scenarios for how you are going to use that data in the
future.

For example, what kinds of apps, services, and user requests do you have to
serve?

If you need any support for a PoC, don't hesitate to contact us.

Cheers,

Michael
On 24.11.2011 at 14:13, bm3780 wrote:

> I'm struggling to determine whether a graph is a good fit for my domain.
> Most of my application is structured data.  However, there are some parts
> that are of a social nature, and a graph seems like a good match.  I guess
> my fear is that having all of the data in a single store, such as a graph,
> would cause problems down the road due to the limitations.
>
> Potentially I need to go down the path of polyglot persistence: storing
> just the social aspect of my data in the graph and storing the other data
> in a document store.  I was trying to simplify our architecture by using
> only a graph, which would make O&M much easier down the road.


Re: [Neo4j] Neo4j Data Capacities

2011-11-24 Thread Chris Gioran
Yes, this is true, with a few notes:

ID reuse complicates things a bit: if you delete nodes and relationships,
some IDs will remain unused until you restart the database. Unclean
shutdowns may also require scanning the store files to determine unused
records (see the Config.REBUILD_IDGENERATORS_FAST parameter). So the 35-bit
address space is an upper limit. Normally the number of "lost" records is
minuscule and easily recoverable, so it is not a big deal.

The 36-bit address space for properties is a lower limit: the ID reuse
issue is practically non-existent for properties, and since 1.5 there is no
1-1 correspondence between property ID and property entry (the smallest
ratio is 1:4, i.e. one property ID can cover up to four property entries).
It all depends on the type of property: whether it classifies as a short
string or short array, and how big it is
(http://docs.neo4j.org/chunked/milestone/short-strings.html).

So you could have a db with around 34 billion nodes with one OUTGOING
relationship per node (so up to two per node, one INCOMING and one
OUTGOING, since every relationship connects two nodes) and at least 68
billion properties, with a max of 68*4=272 billion properties.
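
As a rough check of the figures above, the limits follow directly from the
widths of the ID spaces. A minimal sketch in Python, assuming only the
numbers given in this thread (35-bit node and relationship IDs, 36-bit
property IDs, and at most four property entries per property ID); the
names are illustrative and not part of any Neo4j API:

```python
# Bit widths as stated in this thread; all names here are illustrative only.
NODE_ID_BITS = 35
RELATIONSHIP_ID_BITS = 35
PROPERTY_ID_BITS = 36
MAX_ENTRIES_PER_PROPERTY_ID = 4  # best case mentioned above (ratio 1:4)


def id_space(bits: int) -> int:
    """Number of distinct IDs in an address space of the given width."""
    return 2 ** bits


max_nodes = id_space(NODE_ID_BITS)
max_relationships = id_space(RELATIONSHIP_ID_BITS)
min_properties = id_space(PROPERTY_ID_BITS)
max_properties = min_properties * MAX_ENTRIES_PER_PROPERTY_ID

print(f"nodes:              {max_nodes:,}")
print(f"relationships:      {max_relationships:,}")
print(f"properties (lower): {min_properties:,}")
print(f"properties (upper): {max_properties:,}")
```

The exact upper figure is 2^38, roughly 275 billion; the 272 billion quoted
above comes from rounding 2^36 down to 68 billion before multiplying by 4.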

That's a lot of stuff! For reference, the smallest of those files will
be the node store with a size of (9 bytes/record * 2^35 records)/(2^30
bytes/gigabyte)  = 288 gigabytes. So you will start hitting machine
restrictions before you run out of id space.
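
The node store estimate above can be reproduced in a few lines. This is
only a sketch of the arithmetic, assuming the 9 bytes per node record and
the 35-bit ID space quoted in this message; the function name is
hypothetical and it ignores every other store file:

```python
NODE_RECORD_BYTES = 9     # bytes per node record, as quoted above
NODE_ID_BITS = 35         # width of the node ID space, as quoted above
BYTES_PER_GIGABYTE = 2 ** 30


def store_size_gigabytes(record_bytes: int, id_bits: int) -> float:
    """Size of a fixed-record store file when every ID is in use."""
    return record_bytes * (2 ** id_bits) / BYTES_PER_GIGABYTE


node_store_gb = store_size_gigabytes(NODE_RECORD_BYTES, NODE_ID_BITS)
print(f"full node store: {node_store_gb:.0f} gigabytes")
```

The relationship and property stores would be larger still, which is why
machine limits are likely to bite before the ID space runs out.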

cheers,
CG

On Thu, Nov 24, 2011 at 2:55 PM, bm3780  wrote:
> I've read that Neo4j has data capacity limitations
> (http://docs.neo4j.org/chunked/milestone/capabilities-capacity.html).  I
> would like to confirm my understanding that the node, property, and
> relationship limits apply to each type independently (i.e. an AND
> condition), not as an either/or (i.e. an OR condition).
>
> Neo4j can hold:
>   * ~34 billion nodes, AND
>   * ~34 billion relationships, AND
>   * ~68 billion properties
>
> So I could theoretically have a single graph with 34 billion nodes, where
> each node had two properties and a single relationship.
>
> --
> View this message in context: 
> http://neo4j-community-discussions.438527.n3.nabble.com/Neo4j-Data-Capacities-tp3533552p3533552.html


Re: [Neo4j] Neo4j Data Capacities

2011-11-24 Thread bm3780
I'm struggling to determine whether a graph is a good fit for my domain.
Most of my application is structured data.  However, there are some parts
that are of a social nature, and a graph seems like a good match.  I guess
my fear is that having all of the data in a single store, such as a graph,
would cause problems down the road due to the limitations.

Potentially I need to go down the path of polyglot persistence: storing
just the social aspect of my data in the graph and storing the other data
in a document store.  I was trying to simplify our architecture by using
only a graph, which would make O&M much easier down the road.

--
View this message in context: 
http://neo4j-community-discussions.438527.n3.nabble.com/Neo4j-Data-Capacities-tp3533552p3533597.html


Re: [Neo4j] Neo4j Data Capacities

2011-11-24 Thread Michael Hunger
Correct.

Is this an issue for your domain/data model?

If so, could you say something about your use case / context?

Thanks a lot

Michael

On 24.11.2011 at 13:55, bm3780 wrote:

> I've read that Neo4j has data capacity limitations
> (http://docs.neo4j.org/chunked/milestone/capabilities-capacity.html).  I
> would like to confirm my understanding that the node, property, and
> relationship limits apply to each type independently (i.e. an AND
> condition), not as an either/or (i.e. an OR condition).
>
> Neo4j can hold:
>   * ~34 billion nodes, AND
>   * ~34 billion relationships, AND
>   * ~68 billion properties
> 
> So I could theoretically have a single graph with 34 billion nodes, where
> each node had two properties and a single relationship.


Re: [Neo4j] Neo4j Data Capacities

2011-11-24 Thread Peter Neubauer
Yes.
However, before that you will probably run into other limitations,
like file sizes, IO and RAM. That is why we are a bit careful about
just going to Longs or UUIDs.

Anything you are thinking of in particular?

Cheers,

/peter neubauer




On Thu, Nov 24, 2011 at 1:55 PM, bm3780  wrote:
> I've read that Neo4j has data capacity limitations
> (http://docs.neo4j.org/chunked/milestone/capabilities-capacity.html).  I
> would like to confirm my understanding that the node, property, and
> relationship limits apply to each type independently (i.e. an AND
> condition), not as an either/or (i.e. an OR condition).
>
> Neo4j can hold:
>   * ~34 billion nodes, AND
>   * ~34 billion relationships, AND
>   * ~68 billion properties
>
> So I could theoretically have a single graph with 34 billion nodes, where
> each node had two properties and a single relationship.