Ahh! That's my bad! Sorry!

I corrected it and ran it again, but I still get multiple nodes for some
taxa that are identical. The updated code is available here
<https://github.com/sunitj/SuperMoM/blob/master/IMG/createDB.pl> (lines:
368-393). Here is the snippet:

Also, what's the difference between how I'm creating nodes and using the
create_unique function? Aside from maybe saving me a few lines?

I was trying to use the function, but wasn't able to figure out what the
first and second arguments were. I couldn't find it on the blog I mentioned
above and in your slides they were both "name=>$pkg" or similar.
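
To make the key mismatch concrete, here is a minimal pure-Perl sketch that needs no Neo4j server. Everything in it is a stand-in: the `%index` hash and the helpers `find_entries`, `add_entry`, `new_node` and `create_unique` only mimic the behaviour of an index and are not the REST::Neo4p API, and the species name is made up. As far as I can tell from the REST::Neo4p::Index POD, the first two arguments of the real `create_unique` are the index key and value, followed by a hashref of node properties, and the sketch assumes that shape.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical stand-in for an index: $index{$key}{$value} = [ nodes ... ].
# This simulates the behaviour only; it is NOT the REST::Neo4p API.
my %index;
my $nodes_created = 0;

# Return all nodes stored under key => value (empty list if none).
sub find_entries {
    my ($key, $value) = @_;
    return @{ $index{$key}{$value} // [] };
}

# Store a node under key => value and return it.
sub add_entry {
    my ($node, $key, $value) = @_;
    push @{ $index{$key}{$value} }, $node;
    return $node;
}

# "Create" a node (a plain hashref here) and count it.
sub new_node {
    my (%props) = @_;
    $nodes_created++;
    return +{ %props };
}

# Mismatched pattern: look up under 'name', but add under 'id'.
# The lookup never sees what was added, so every call makes a new node.
sub get_taxon_mismatched {
    my ($species) = @_;
    my ($node) = find_entries(name => $species);
    unless ($node) {
        $node = new_node(id => $species);
        add_entry($node, id => $species);
    }
    return $node;
}

# Get-or-create: the same key => value is used for lookup and insert.
sub create_unique {
    my ($key, $value, $props) = @_;
    my ($node) = find_entries($key, $value);
    return $node if $node;
    return add_entry(new_node(%$props), $key, $value);
}

# Made-up input: the same species seen three times.
my @species = ('Escherichia coli') x 3;

get_taxon_mismatched($_) for @species;
my $mismatched_count = $nodes_created;

# Reset the simulated index and repeat with get-or-create.
%index         = ();
$nodes_created = 0;
create_unique(id => $_, { id => $_ }) for @species;
my $unique_count = $nodes_created;

print "mismatched keys: $mismatched_count node(s)\n";
print "create_unique:   $unique_count node(s)\n";
```

With three identical species, the mismatched version creates three nodes and the get-or-create version creates one. So beyond saving a few lines, the point of a single get-or-create call is that the lookup and the insert cannot drift apart onto different keys.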

--
Sunit Jain
Research Computing Specialist -- Bioinformatics
Michigan Geomicrobiology Lab
Dept. of Earth & Environmental Sciences,
University of Michigan,
Ann Arbor, MI, USA.
email: [email protected]
web: www.sunitjain.com
meet: www.sunitjain.com/contact

On Wed, Mar 18, 2015 at 10:10 PM, Mark Jensen <[email protected]>
wrote:

> Thanks Sunit --
> I think your problem is the difference highlighted in the code below.
> You're looking for the species with the key 'name', but adding to the index
> with key 'id'.
>
> You may find the $idx->create_unique()
> <https://metacpan.org/pod/REST::Neo4p::Index#create_unique> method helpful
> too.
> MAJ
>
> if ($PhyloDist{$gene}{"DOMAIN"}) {
>   my $species=$PhyloDist{$gene}{"SPECIES"};
>   ($taxa_nodes{$gene})= $idx->find_entries(name=>$species);
>   unless ($taxa_nodes{$gene}) {
>     $taxa_nodes{$gene}=REST::Neo4p::Node->new({id=>$PhyloDist{$gene}{"SPECIES"}});
>     $taxa_nodes{$gene}->set_labels("Taxa");
>     foreach (keys %{$PhyloDist{$gene}}){
>       next if $_ eq "SPECIES";
>       next if $_ eq "PERCENT";
>       my $value=lc($PhyloDist{$gene}{$_});
>       my $key=lc($_);
>       $taxa_nodes{$gene}->set_property({$key=>$value});
>     }
>     $idx->add_entry($taxa_nodes{$gene}, id=>$species);
>   }
> ...
> }
>
>
> On Wednesday, March 18, 2015 at 5:30:15 PM UTC-4, Sunit Jain wrote:
>>
>> First, congratulations on creating such a great perl driver for Neo4j. I
>> really appreciate the work you must have put into it.
>>
>> I've been trying to use this driver to create a database for our
>> meta*omic data. I was successfully able to put together some perl code by
>> following some slides <http://www.slideshare.net/majensen1/dcpm-meetup>,
>> the neo4j blog post <http://neo4j.com/blog/restneo4p-a-perl-ogm/> about
>> this driver and the MetaCPAN <https://metacpan.org/pod/REST::Neo4p>
>> description. However I'm getting stuck at a point where I'm no longer sure
>> what's going on. I'm hoping you might be able to help.
>>
>> *As a side note, the example on the neo4j blog
>> <http://neo4j.com/blog/restneo4p-a-perl-ogm/> seemed very limited and is
>> about 2 years old. Is there a more recent version somewhere? Maybe one with
>> best practices? If not, I'd be happy to start one explaining what I did for
>> my current project, once I have at least one successful run. It won't be
>> as insightful, but it'll be something.*
>>
>> *Goal:*
>> Create unique Taxa nodes, and have the gene loci that belong to a Taxa
>> node relate to it with an "IN_ORGANISM" relationship:
>>
>> (Taxa)<-[:IN_ORGANISM]-(Locus)
>>
>>
>> More details can be found in createDB.pl (lines: 326-352), here
>> <https://github.com/sunitj/SuperMoM/tree/master/IMG>
>>
>> *Issue:*
>> Here is the perl snippet of my code to create unique 'Taxa' nodes:
>> [image: Inline image 1]
>>
>> Perl snippet to create unique relations to Taxa:
>> [image: Inline image 2]
>>
>> When I run this script, it creates the exact same taxa node 94 times! I
>> did a quick grep in my CSV and found 94 instances of that
>> taxon. So the script essentially created a new node each time it
>> encountered the species. I also created scaffold, loci, COG, PFam and
>> Project nodes in much the same way, but in all those cases only unique
>> nodes were created. The only difference was that the property "id" was
>> "$species", which is a text value with spaces in the case of Taxa, whereas
>> for all the others it was an alphanumeric value without spaces. I don't
>> see how this could affect the outcome, though.
>>
>> I apologize for the lengthy email.
>>
>> ================
>> Linux RHEL Server 6.5
>> Perl 5.18
>> Neo4j 2.1.7
>> Java 1.7
>> ================
>> --
>> Sunit Jain
>> Research Computing Specialist -- Bioinformatics
>> Michigan Geomicrobiology Lab
>> Dept. of Earth & Environmental Sciences,
>> University of Michigan,
>> Ann Arbor, MI, USA.
>> web: www.sunitjain.com
>> meet: www.sunitjain.com/contact
>>

-- 
You received this message because you are subscribed to the Google Groups 
"Neo4j" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.
