[Neo4j] possibility to merge some neo4j databases?

2011-11-29 Thread osallou
Hi,
I need to batch-insert millions of records into Neo4j.
It is difficult to keep everything in a Map to look up node ids, so the
import needs frequent index lookups to find node ids for relationships, and
the resulting throughput is quite low.

Is there any way to build several Neo4j databases (independently) and then
merge them? (I could build many small dbs in parallel.)

Thanks

Olivier

--
View this message in context: 
http://neo4j-community-discussions.438527.n3.nabble.com/possibility-to-merge-some-neo4j-databases-tp3544694p3544694.html
Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] possibility to merge some neo4j databases?

2011-11-29 Thread Craig Taverner
There are two approaches I can think of:
- Use a better index for mapping ids. Lucene is too slow. In-memory hashtables
are memory bound. Peter has been investigating alternative dbs like bdb. I
tried, but did not finish, a hashmap of cached arrays, and Chris wrote his
big data import project on github, which is a hashmap of cached hashmaps.
Many promising solutions, but none yet complete. All target the general
case of id mapping.
- For this specific case, merging small databases, I had an idea a couple
of years ago which I still think will work: bulk appending entire
databases by offsetting all internal ids by the current max id. I remember
the reason Johan did not like this idea was that it suffered from the same
flaws as the batch inserter: locking the entire db, no rollback and risk of
entire db corruption. For people happy with the batch inserter, perhaps
this is still an option, but it is unlikely to get prioritized by the neo team
because of the corruption risks. It would, however, perform spectacularly
well since the id map is a trivial function.
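
To illustrate the id-offset idea, here is a minimal sketch in Python. It does
not touch Neo4j's store files or API. ToyStore and append_store are
hypothetical names, and the store is assumed to use dense, zero-based
internal ids, so the offset is the number of nodes already in the target
(current max id plus one).

```python
from dataclasses import dataclass, field


@dataclass
class ToyStore:
    """Hypothetical stand-in for a graph store with dense, zero-based ids."""
    nodes: list = field(default_factory=list)  # index = node id, value = properties
    rels: list = field(default_factory=list)   # (start_id, end_id, type) tuples

    def add_node(self, **props):
        self.nodes.append(props)
        return len(self.nodes) - 1

    def add_rel(self, start, end, rel_type):
        self.rels.append((start, end, rel_type))


def append_store(target, source):
    """Append source to target, shifting every source id by one fixed offset."""
    offset = len(target.nodes)  # first free id in target (max id + 1)
    target.nodes.extend(dict(p) for p in source.nodes)
    # The whole "id map" is the function old_id -> old_id + offset.
    target.rels.extend((s + offset, e + offset, t) for s, e, t in source.rels)
    return offset


# Two independently built stores, merged without any index lookup.
big = ToyStore()
n0 = big.add_node(name="a0")
n1 = big.add_node(name="a1")
big.add_rel(n0, n1, "KNOWS")

small = ToyStore()
m0 = small.add_node(name="b0")
m1 = small.add_node(name="b1")
small.add_rel(m0, m1, "KNOWS")

offset = append_store(big, small)
```

Note that in this sketch only relationships inside each small store are
carried over. Relationships between nodes of different small stores would
still need some id lookup after the merge.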

Personally, I hope someone completes Chris's persistent hashmap or a similar
solution. Id maps are a recurring theme and would be very valuable.
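
As a rough illustration of what a "hashmap of cached hashmaps" could look
like, here is a sketch in Python. It is not Chris's implementation. The class
name BucketedIdMap, the bucket count, the cache size and the pickle files are
all assumptions made for this example, and keys are assumed to be strings.
Keys are split into buckets by a stable hash, only a few buckets stay in
memory, and the least recently used bucket is written to disk when the cache
is full.

```python
import os
import pickle
import tempfile
import zlib
from collections import OrderedDict


class BucketedIdMap:
    """Hypothetical two-level map: hash buckets on disk, LRU cache in memory."""

    def __init__(self, directory, num_buckets=64, max_cached=4):
        self.directory = directory
        self.num_buckets = num_buckets
        self.max_cached = max_cached
        self.cache = OrderedDict()  # bucket number -> dict, oldest first

    def _path(self, bucket_no):
        return os.path.join(self.directory, f"bucket-{bucket_no}.pickle")

    def _write(self, bucket_no, bucket):
        with open(self._path(bucket_no), "wb") as f:
            pickle.dump(bucket, f)

    def _bucket(self, key):
        # crc32 is stable across runs, unlike Python's built-in str hash.
        bucket_no = zlib.crc32(key.encode("utf-8")) % self.num_buckets
        if bucket_no in self.cache:
            self.cache.move_to_end(bucket_no)  # mark as most recently used
            return self.cache[bucket_no]
        path = self._path(bucket_no)
        if os.path.exists(path):
            with open(path, "rb") as f:
                bucket = pickle.load(f)
        else:
            bucket = {}
        self.cache[bucket_no] = bucket
        if len(self.cache) > self.max_cached:
            # Evict the least recently used bucket; always rewritten for simplicity.
            old_no, old_bucket = self.cache.popitem(last=False)
            self._write(old_no, old_bucket)
        return bucket

    def put(self, key, node_id):
        self._bucket(key)[key] = node_id

    def get(self, key):
        return self._bucket(key).get(key)

    def flush(self):
        for bucket_no, bucket in self.cache.items():
            self._write(bucket_no, bucket)


# Usage: map 1000 external keys to node ids with at most 2 buckets in memory.
demo_dir = tempfile.mkdtemp()
id_map = BucketedIdMap(demo_dir, num_buckets=8, max_cached=2)
for i in range(1000):
    id_map.put(f"key-{i}", i)
id_map.flush()
```

Lookups stay cheap when input is roughly grouped by bucket. With randomly
ordered keys, each lookup may load a bucket from disk, so the bucket count
and cache size would need tuning for real data.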


Re: [Neo4j] possibility to merge some neo4j databases?

2011-12-02 Thread Michael Hunger
Sure, the limitations apply, but since only the target database could be
corrupted, and none of the ones being used for the import, that should be ok.

That actually sounds like a nice lab-day project.

I'll add it to the list.

Michael
