Re: [Neo4j] Neo4j with MapReduce inserts

2011-06-17 Thread Jim Webber
Hello Sulabh,

We're going to need a little more information before we can help.

Can you tell us how it fails? Are you trying to run a batch inserter on 
different databases on each of your parallel jobs? 

Jim


Re: [Neo4j] Neo4j with MapReduce inserts

2011-06-17 Thread Michael Hunger
Also, what technology are you writing those map-reduce jobs with (framework,
runtime environment, etc.)?

Some code samples would be great as well.

Cheers

Michael



Re: [Neo4j] Neo4j with MapReduce inserts

2011-06-17 Thread sulabh choudhury
Well, as I mentioned, the code does not fail anywhere; it runs its full course
and just skips the writing-to-the-graph part.
I have just one graph, and I pass just one instance of the BatchInserter to
the map function.

My code is in Scala; a sample is below.


class ExportReducer extends Reducer[Text, MapWritable, LongWritable, Text] {

  type Context = org.apache.hadoop.mapreduce.Reducer[Text, MapWritable,
    LongWritable, Text]#Context

  @throws(classOf[Exception])
  override def reduce(key: Text, value: java.lang.Iterable[MapWritable],
      context: Context) {

    val keys: Array[String] = key.toString.split(":")
    val uri1 = "first" + keys(0)
    val uri2 = "last" + keys(1)

    // create and index the first node
    ExportReducerObject.propertiesUID.put("ID", uri1)
    val node1 = ExportReducerObject.batchInserter.createNode(ExportReducerObject.propertiesUID)
    ExportReducerObject.indexService.add(node1, ExportReducerObject.propertiesUID)

    // create and index the second node
    ExportReducerObject.propertiesCID.put("ID", uri2)
    val node2 = ExportReducerObject.batchInserter.createNode(ExportReducerObject.propertiesCID)
    ExportReducerObject.indexService.add(node2, ExportReducerObject.propertiesCID)

    // connect the two nodes
    ExportReducerObject.propertiesEdges.put("fullName", 1.0)
    ExportReducerObject.batchInserter.createRelationship(node1, node2,
      DynamicRelationshipType.withName("fullName"), ExportReducerObject.propertiesEdges)
  }
}

My graph properties are defined as below:

val batchInserter = new BatchInserterImpl(graph,
  BatchInserterImpl.loadProperties("neo4j.props"))
val indexProvider = new LuceneBatchInserterIndexProvider(batchInserter)
val indexService = indexProvider.nodeIndex("ID", MapUtil.stringMap("type", "exact"))
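
For completeness, a batch-inserter setup like this normally needs an explicit
shutdown so that everything is flushed to disk; a minimal sketch (a general
note about the API, not something discussed in this thread):

// shut the index machinery down before the inserter, flushing pending entries
indexService.flush()
indexProvider.shutdown()
batchInserter.shutdown()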


Note that the code works perfectly (it writes to the graph) when running in
local mode.

On Fri, Jun 17, 2011 at 11:32 AM, sulabh choudhury sula...@gmail.com wrote:

 I am trying to write a MapReduce job that does Neo4j batch inserts.
 It works fine when I just run it as a plain Java program (in local mode) and
 does the inserts, but when I try to run it in distributed mode it does
 not write to the graph.
 Is this an issue related to permissions?
 I have no clue where to look.




--
Thanks and Regards,
Sulabh Choudhury


Re: [Neo4j] Neo4j with MapReduce inserts

2011-06-17 Thread Michael Hunger
Hi Sulabh,

what do you mean by 'local' mode?

The batch inserter can only be used in a single-threaded environment. You
shouldn't use it in a concurrent environment, as it will fail unpredictably.

Please use the EmbeddedGraphDatabase instead.
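
A minimal sketch of what that could look like (the GraphWriter object, the
store path, and the property keys here are my assumptions, and each JVM must
point at its own store):

import org.apache.hadoop.io.{LongWritable, MapWritable, Text}
import org.apache.hadoop.mapreduce.Reducer
import org.neo4j.graphdb.DynamicRelationshipType
import org.neo4j.kernel.EmbeddedGraphDatabase

object GraphWriter {
  // one embedded database per JVM; it supports concurrent transactions
  val graphDb = new EmbeddedGraphDatabase("target/graph.db")
}

class TxExportReducer extends Reducer[Text, MapWritable, LongWritable, Text] {

  type Context = Reducer[Text, MapWritable, LongWritable, Text]#Context

  override def reduce(key: Text, value: java.lang.Iterable[MapWritable],
      context: Context) {
    val tx = GraphWriter.graphDb.beginTx()
    try {
      val keys = key.toString.split(":")
      val node1 = GraphWriter.graphDb.createNode()
      node1.setProperty("ID", "first" + keys(0))
      val node2 = GraphWriter.graphDb.createNode()
      node2.setProperty("ID", "last" + keys(1))
      node1.createRelationshipTo(node2, DynamicRelationshipType.withName("fullName"))
      tx.success()    // mark the transaction as successful
    } finally {
      tx.finish()     // commit (or roll back on failure) and release resources
    }
  }
}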

Michael



Re: [Neo4j] Neo4j with MapReduce inserts

2011-06-17 Thread sulabh choudhury
Are you saying that in an M/R environment each Map (or Reduce) process will
try to have its own instance of the BatchInserter, and hence it would fail?

When I say "local" I mean that the code works fine when I just use the M/R
API, but fails when I try to run it in distributed mode.
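
For reference, a minimal sketch of what "local mode" means in Hadoop 1.x
terms (the configuration keys are the standard ones; the values are my
assumptions about this setup):

import org.apache.hadoop.conf.Configuration

val conf = new Configuration()
// LocalJobRunner: the whole job runs in a single JVM, which is why a
// single-threaded BatchInserter happens to work in local mode
conf.set("mapred.job.tracker", "local")
// use the local filesystem instead of HDFS
conf.set("fs.default.name", "file:///")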


--
Thanks and Regards,
Sulabh Choudhury


Re: [Neo4j] Neo4j with MapReduce inserts

2011-06-17 Thread Michael Hunger
No, that would be even worse.

A single BatchInserter, and every graph-db store that is currently being
written to by a batch inserter, MUST be accessed from only a single,
single-threaded environment.

Please use the normal EmbeddedGraphDatabase for your multi-threaded MR jobs.
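
If the BatchInserter has to stay, one way to honor that constraint (my
assumption, not something from this thread) is to funnel all output through
exactly one reduce task, so that a single JVM and thread own the inserter:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.mapreduce.Job

val job = new Job(new Configuration(), "neo4j-batch-insert")
// a single reduce task means a single JVM/thread touches the BatchInserter;
// the store must then live on that task's local filesystem, not on HDFS
job.setNumReduceTasks(1)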

Cheers

Michael



Re: [Neo4j] Neo4j with MapReduce inserts

2011-06-17 Thread sulabh choudhury
Alright, thank you all.


--
Thanks and Regards,
Sulabh Choudhury
_______________________________________________
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user