Hi. It's possible that your original email didn't make it to the list because 
it was in HTML format, and the list only accepts plain text.

However, in answer to your two questions:

  1. The code that resolves references would probably work better if it 
looked up existing IDs rather than using author, title and location to 
identify existing records. I would suggest modifying it to a three-step 
process: test the ID; if there is no match, test author/title/location; if 
there is still no match, create a new reference. Could someone do that? (I'm 
unable to do anything until late March).
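
A rough, untested sketch of the three-step lookup I have in mind is below. It 
uses made-up stand-in classes (ReferenceResolver, DocRefRecord) and an 
in-memory list in place of the Hibernate query, so none of these names are 
real BioJava or BioSQL API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical sketch only: an in-memory list stands in for the "reference" table.
final class ReferenceResolver {

    /** Minimal stand-in for one row of the "reference" table. */
    static final class DocRefRecord {
        final Integer dbxrefId;  // null when the reference has no cross-reference
        final String authors;
        final String title;      // may be null
        final String location;

        DocRefRecord(Integer dbxrefId, String authors, String title, String location) {
            this.dbxrefId = dbxrefId;
            this.authors = authors;
            this.title = title;
            this.location = location;
        }
    }

    private final List<DocRefRecord> store = new ArrayList<>();

    /** Returns an existing record if one matches, otherwise creates and stores a new one. */
    DocRefRecord resolve(Integer dbxrefId, String authors, String title, String location) {
        // Step 1: if an ID is known, an existing record with the same ID wins.
        if (dbxrefId != null) {
            for (DocRefRecord r : store) {
                if (dbxrefId.equals(r.dbxrefId)) {
                    return r;
                }
            }
        }
        // Step 2: fall back to an exact match on authors/title/location.
        for (DocRefRecord r : store) {
            if (Objects.equals(authors, r.authors)
                    && Objects.equals(title, r.title)
                    && Objects.equals(location, r.location)) {
                return r;
            }
        }
        // Step 3: nothing matched, so create a new reference.
        DocRefRecord created = new DocRefRecord(dbxrefId, authors, title, location);
        store.add(created);
        return created;
    }

    int size() {
        return store.size();
    }
}
```

With this ordering, two spellings of the same paper that share an ID resolve 
to one record, so a second insert with the same dbxref_id is never attempted.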

  2. I think that's a bug (compound locations with null features) but I'm not 
sure why. It could be that the process of constructing a CompoundRichLocation 
is somehow losing the feature reference from the original SimpleRichLocation. 
Again, I can't investigate until March - can someone else take a look at the 
code? (A good starting point would be to look at how a CompoundRichLocation 
selects the feature from the SimpleRichLocations it is made up of).
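
To make the invariant concrete, here is a small sketch of the check I would 
start with. The classes below (Feature, SimpleLoc, CompoundLoc) are 
hypothetical stand-ins, not the real BioJava types, and I have not checked 
that CompoundRichLocation actually propagates the feature this way. The point 
is only that after setFeature every member should report the same non-null 
feature before Hibernate cascades the save:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; not the BioJava classes.
final class LocationFeatureSketch {

    static final class Feature {
        final String type;
        Feature(String type) { this.type = type; }
    }

    static final class SimpleLoc {
        final int start;
        final int end;
        private Feature feature;

        SimpleLoc(int start, int end) {
            this.start = start;
            this.end = end;
        }

        Feature getFeature() { return feature; }
        void setFeature(Feature feature) { this.feature = feature; }
    }

    static final class CompoundLoc {
        private final List<SimpleLoc> members;
        private Feature feature;

        CompoundLoc(List<SimpleLoc> members) {
            this.members = new ArrayList<>(members);
        }

        Feature getFeature() { return feature; }
        List<SimpleLoc> getMembers() { return members; }

        // Setting the feature on the compound also sets it on every member,
        // so no member is left with a null feature.
        void setFeature(Feature feature) {
            this.feature = feature;
            for (SimpleLoc member : members) {
                member.setFeature(feature);
            }
        }
    }

    /** Lists the members that would trip a not-null check on their feature. */
    static List<SimpleLoc> membersWithoutFeature(CompoundLoc loc) {
        List<SimpleLoc> missing = new ArrayList<>();
        for (SimpleLoc member : loc.getMembers()) {
            if (member.getFeature() == null) {
                missing.add(member);
            }
        }
        return missing;
    }
}
```

Running a check like membersWithoutFeature over every feature's location just 
before the save should show whether the null appears at parse time or only 
once the session gets involved.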

cheers,
Richard

On 9 Feb 2010, at 20:21, Deepak Sheoran wrote:

> 
> Hi Richard
> 
> Below is the email which I sent to the Biojava-l mailing list, but it never 
> got posted on the mailing list server, nor did I get any response. Please 
> have a look at this email and tell me what the solution to the problem 
> described in the message could be.
> 
> 
> Thanks
> Deepak Sheoran
> -------- Original Message --------
> Subject:      Hibernate Exception and suggestion for change in BioSqlSchema
> Date: Wed, 03 Feb 2010 08:07:35 -0600
> From: Deepak Sheoran <[email protected]>
> To:   [email protected]
> 
> Hi guys,
> 
> A couple of days back I was having a problem with a Hibernate exception, 
> which has since been resolved; the reference to that email is:  
> http://old.nabble.com/Hibernate-Exception-when-persisting-some-richsequence-object-to-biosql-schema-to27299245.html
> Following Richard's suggestion in the above link I was able to resolve some 
> of the issues, but then I got stuck on another Hibernate error, so I decided 
> to investigate. Below are the facts and information that I found; I guess 
> this is going to affect all of us.
>       • The "Reference" table in the BioSQL schema has a unique constraint on 
> the "dbxref_id" column (CONSTRAINT reference_dbxref_id_key UNIQUE 
> (dbxref_id)), which means only one entry in the reference table can use a 
> given dbxref_id.
> This works well except when there are small variations in the values of the 
> "location", "title" and "authors" columns and all these variations refer to 
> the same PUBMED_ID. In that case we can't persist or create a richsequence 
> object.
>  Now, when you tie RichObjectFactory to an active Hibernate session, the 
> class "BioSqlRichObjectBuilder" has a method called "buildObject(Class clazz, 
> List paramsList)" which is responsible for looking up the details of an 
> object in the database. If it finds one it returns that object; otherwise it 
> tries to persist the new object into the database.
> But the problem is with the part of that method shown below:
> ..... LineNumber: 114
> else if (SimpleDocRef.class.isAssignableFrom(clazz)) {
>     queryType = "DocRef";
>     // convert List constructor to String representation for query
>     ourParamsList.set(0, DocRefAuthor.Tools.generateAuthorString((List)ourParamsList.get(0), true));
>     if (ourParamsList.size()<3) {
>         queryText = "from DocRef as cr where cr.authors = ? and cr.location = ? and cr.title is null";
>     } else {
>         queryText = "from DocRef as cr where cr.authors = ? and cr.location = ? and cr.title = ?";
>     }
> }
> ..... LineNumber: 123
> Now when Hibernate searches the database, it won't find the existing record 
> in the "reference" table because the two records differ in string 
> comparison, so a new object is returned to "GenbankFormat", to the 
> following piece of code:
> ..... LineNumber: 447
> else {
>     try {
>         CrossRef cr = (CrossRef)RichObjectFactory.getObject(SimpleCrossRef.class, new Object[]{dbname, raccession, new Integer(0)});
>         RankedCrossRef rcr = new SimpleRankedCrossRef(cr, ++rcrossrefCount);
>         rlistener.getCurrentFeature().addRankedCrossRef(rcr);
>     } catch (ChangeVetoException e) {
>         throw new ParseException(e+", accession:"+accession);
>     }
> }
> ..... LineNumber: 455
> Then that object is added to rlistener, and parsing moves on to the next part 
> of the GenBank record. When BioJava then searches for a new crossref in the 
> database, it tries to persist the old object and gets a Hibernate exception 
> about a violation of the "unique constraint on dbxref_id" column.
>  
> The only ways to get these records into the database are:
>               • The very easy solution, and the way I did it to test my 
> theory, is to change the BioSQL schema so that it allows a many-to-one 
> relation between the "reference" and "dbxref" tables. This even makes sense, 
> because one paper can be cited with many different naming variations, and 
> this change allows us to store that information too. But this is something 
> the BioSQL people have to decide, and I don't know how to approach them.
>               • The second solution, which is slightly more difficult to 
> implement, is to change the way "BioSqlRichObjectBuilder.buildObject(Class 
> clazz, List paramsList)" decides whether a particular DocRef already exists 
> in the database. I mean testing all possible string variations of the 
> authors, location and title of the DocRef we are searching for. This has 
> many complications and may slow down the creation of a richsequence object 
> when RichObjectFactory is linked with an active Hibernate session.
>  
> Example: Below is a sample of what I have in my local BioSQL schema, which 
> has the modification I suggested. (The Dbxref_id column shows the PubMed ID: 
> I replaced the local dbxref_id that was present in this table in my database 
> with the PubMed ID stored in the "dbxref" table, for easy reference with the 
> outside world in this email.)
> Reference_id: 216
> Dbxref_id:    18554304
> Location:     FEMS Microbiol. Ecol. 66 (3THEMATIC ISSUE: GUT MICROBIOLOGY), 528-536 (2008)
> Title:        Isolation of lactate-utilizing butyrate-producing bacteria from human feces 
>               and in vivo administration of Anaerostipes caccae strain L2 and 
>               galacto-oligosaccharides in a rat model
> Authors:      Sato,T., Matsumoto,K., Okumura,T., Yokoi,W., Naito,E., Yoshida,Y., Nomoto,K., 
>               Ito,M. and Sawada,H.
> crc:          9E940E01F4BE3CD0
> 
> Reference_id: 230
> Dbxref_id:    18554304
> Location:     FEMS Microbiol. Ecol. 66 (3), 528-536 (2008)
> Title:        Isolation of lactate-utilizing butyrate-producing bacteria from human feces 
>               and in vivo administration of Anaerostipes caccae strain L2 and 
>               galacto-oligosaccharides in a rat model
> Authors:      Sato,T., Matsumoto,K., Okumura,T., Yokoi,W., Naito,E., Yoshida,Y., Nomoto,K., 
>               Ito,M. and Sawada,H.
> crc:          D3BC0C17F3F786C9
> 
> Reference_id: 415
> Dbxref_id:    16790744
> Location:     Infect. Immun. 74 (7), 3715-3726 (2006)
> Title:        Intrastrain Heterogeneity of the mgpB Gene in Mycoplasma genitalium Is 
>               Extensive In Vitro and In Vivo and Suggests that Variation Is Generated via 
>               Recombination with Repetitive Chromosomal Sequences
> Authors:      Iverson-Cabral,S.L., Astete,S.G., Cohen,C.R., Rocha,E.P. and Totten,P.A.
> crc:          60AEDFA0CEEACC38
> 
> Reference_id: 969
> Dbxref_id:    16790744
> Location:     Infect. Immun. 74 (7), 3715-3726 (2006)
> Title:        Intrastrain heterogeneity of the mgpB gene in mycoplasma genitalium is 
>               extensive in vitro and in vivo and suggests that variation is generated via 
>               recombination with repetitive chromosomal sequences
> Authors:      Iverson-Cabral,S.L., Astete,S.G., Cohen,C.R., Rocha,E.P. and Totten,P.A.
> crc:          4B1232999F6E8130
> 
> Reference_id: 929
> Dbxref_id:    8688087
> Location:     Science 273 (5278), 1058-1073 (1996)
> Title:        Complete genome sequence of the methanogenic archaeon, Methanococcus 
>               jannaschii
> Authors:      Bult,C.J., White,O., Olsen,G.J., Zhou,L., Fleischmann,R.D., Sutton,G.G., 
>               Blake,J.A., FitzGerald,L.M., Clayton,R.A., Gocayne,J.D., Kerlavage,A.R., 
>               Dougherty,B.A., Tomb,J.-F., Adams,M.D., Reich,C.I., Overbeek,R., 
>               Kirkness,E.F., Weinstock,K.G., Merrick,J.M., Glodek,A., Scott,J.L., 
>               Geoghagen,N.S.M., Weidman,J.F., Fuhrmann,J.L., Presley,E.A., Nguyen,D., 
>               Utterback,T.R., Kelley,J.M., Peterson,J.D., Sadow,P.W., Hanna,M.C., 
>               Cotton,M.D., Hurst,M.A., Roberts,K.M., Kaine,B.P., Borodovsky,M., 
>               Klenk,H.-P., Fraser,C.M., Smith,H.O., Woese,C.R. and Venter,J.C.
> crc:          3E79B40DD2AAA2B7
> 
> Reference_id: 932
> Dbxref_id:    8688087
> Location:     Science 273 (5278), 1058-1073 (1996)
> Title:        Complete genome sequence of the methanogenic archaeon, Methanococcus 
>               jannaschii
> Authors:      Bult,C.J., White,O., Olsen,G.J., Zhou,L., Fleischmann,R.D., Sutton,G.G., 
>               Blake,J.A., FitzGerald,L.M., Clayton,R.A., Gocayne,J.D., Kerlavage,A.R., 
>               Dougherty,B.A., Tomb,J., Adams,M.D., Reich,C.I., Overbeek,R., Kirkness,E.F., 
>               Weinstock,K.G., Merrick,J.M., Glodek,A., Scott,J.D., Geoghagen,N.S., 
>               Weidman,J.F., Fuhrmann,J.L., Nguyen,D.T., Utterback,T., Kelley,J.M., 
>               Peterson,J.D., Sadow,P.W., Hanna,M.C., Cotton,M.D., Hurst,M.A., Roberts,K.M., 
>               Kaine,B.B., Borodovsky,M., Klenk,H.P., Fraser,C.M., Smith,H.O., Woese,C.R. 
>               and Venter,J.C.
> crc:          094EB3384F8D6DE8
> 
> Reference_id: 1426
> Dbxref_id:    10684935
> Location:     Nucleic Acids Res. 28 (6), 1397-1406 (2000)
> Title:        Genome sequences of Chlamydia trachomatis MoPn and Chlamydia pneumoniae AR39
> Authors:      Read,T.D., Brunham,R.C., Shen,C., Gill,S.R., Heidelberg,J.F., White,O., 
>               Hickey,E.K., Peterson,J., Umayam,L.A., Utterback,T., Berry,K., Bass,S., 
>               Linher,K., Weidman,J., Khouri,H., Craven,B., Bowman,C., Dodson,R., Gwinn,M., 
>               Nelson,W., DeBoy,R., Kolonay,J., McClarty,G., Salzberg,S.L., Eisen,J. and 
>               Fraser,C.M.
> crc:          357648D8FD8C6C8A
> 
> Reference_id: 1481
> Dbxref_id:    10684935
> Location:     Nucleic Acids Res. 28 (6), 1397-1406 (2000)
> Title:        Genome sequences of Chlamydia trachomatis MoPn and Chlamydia pneumoniae AR39
> Authors:      Read,T., Brunham,R., Shen,C., Gill,S., Heidelberg,J., White,O., Hickey,E., 
>               Peterson,J., Utterback,T., Berry,K., Bass,S., Linher,K., Weidman,J., 
>               Khouri,H., Craven,B., Bowman,C., Dodson,R., Gwinn,M., Nelson,W., DeBoy,R., 
>               Kolonay,J., McClarty,G., Salzberg,S., Eisen,J. and Fraser,C.
> crc:          115411EB2DEE5654
> 
> Reference_id: 1497
> Dbxref_id:    14689165
> Location:     Arch. Microbiol. 181 (2), 144-154 (2004)
> Title:        The effect of FITA mutations on the symbiotic properties of Sinorhizobium 
>               fredii varies in a chromosomal-background-dependent manner
> Authors:      Vinardell,J.M., Lopez-Baena,F.J., Hidalgo,A., Ollero,F.J., Bellogin,R., del 
>               Rosario Espuny,M., Temprano,F., Romero,F., Krishnan,H.B., Pueppke,S.G. and 
>               Ruiz-Sainz,J.E.
> crc:          4D5D376EECCD186B
> 
> Reference_id: 1501
> Dbxref_id:    14689165
> Location:     Arch. Microbiol. 181 (2), 144-154 (2004)
> Title:        The effect of FITA mutations on the symbiotic properties of Sinorhizobium 
>               fredii varies in a chromosomal-background-dependent manner
> Authors:      Vinardell,J.M., Lopez-Baena,F.J., Hidalgo,A., Ollero,F.J., Bellogin,R., Del 
>               Rosario Espuny,M., Temprano,F., Romero,F., Krishnan,H.B., Pueppke,S.G. and 
>               Ruiz-Sainz,J.E.
> crc:          4D57954EECDED66B
> 
> Reference_id: 1556
> Dbxref_id:    18060065
> Location:     PLoS ONE 2 (12), E1271 (2007)
> Title:        Analysis of the Neurotoxin Complex Genes in Clostridium botulinum A1-A4 and 
>               B1 Strains: BoNT/A3, /Ba4 and /B1 Clusters Are Located within Plasmids
> Authors:      Smith,T.J., Hill,K.K., Foley,B.T., Detter,J.C., Munk,A.C., Bruce,D.C., 
>               Doggett,N.A., Smith,L.A., Marks,J.D., Xie,G. and Brettin,T.S.
> crc:          698688FB6DB95247
> 
> Reference_id: 1559
> Dbxref_id:    18060065
> Location:     PLoS ONE 2 (12), E1271 (2007)
> Title:        Analysis of the neurotoxin complex genes in Clostridium botulinum A1-A4 and 
>               B1 strains: BoNT/A3, /Ba4 and /B1 clusters are located within plasmids
> Authors:      Smith,T.J., Hill,K.K., Foley,B.T., Detter,J.C., Munk,C.A., Bruce,D.C., 
>               Doggett,N.A., Smith,L.A., Marks,J.D., Xie,G. and Brettin,T.S.
> crc:          E25E1BA99DB18F3D
>  
>       • The second kind of error I got was: 
> org.hibernate.PropertyValueException: not-null property references a null or 
> transient value: Location.feature
>               • This means that in the richsequence object some feature has a 
> location object whose feature is set to null.
>               • My observations:
>                       • It usually occurs when you try to persist a 
> richsequence object to the database, and it affects features that have a 
> CompoundRichLocation, usually "join" and "complement" locations in the CDS 
> region of a GenBank record.
>                       • After catching the Hibernate exception I went through 
> all the features and found that either BioJava or Hibernate had changed the 
> object type of a CompoundRichLocation to SimpleRichLocation and set its 
> feature variable to null.
>                       • Below are screenshots from one of my tests:
>                               • Settings before trying to persist the 
> richsequence object to the database:
>  
> <Mail Attachment.png>
>  
>                               • After trying to persist the richsequence 
> object to the database, inside the Hibernate exception catch block:
>  
> <Mail Attachment.png>
>  
>               • So my question is: why is this happening, and how can I stop 
> it or otherwise get these records into the database? I have no clue why this 
> is happening.
>               • Some extra information to make things clearer:
>                       • Below are the LOCUS lines of some GenBank records for 
> which I know where the error occurs, that is, the CDS region causing the 
> error, together with its array index in the richsequence.feature ArrayList 
> object.
>                               • LOCUS       AE001439             1643831 bp    DNA     circular BCT 19-JAN-2006
>                                       • richSequence.feature index: 2540; line number in the GenBank record: 22115
>                               • LOCUS       CP001189             3887492 bp    DNA     circular BCT 16-OCT-2008
>                                       • richSequence.feature index: 127; line number in the GenBank record: 2137
>                               • LOCUS       CP001292              328635 bp    DNA     circular BCT 17-DEC-2008
>                                       • richSequence.feature index: 389; line number in the GenBank record: 3632
>                               • LOCUS       AM279694              238517 bp    DNA     linear   BCT 23-OCT-2008
>                                       • richSequence.feature index: 47; line number in the GenBank record: 4841
>                               • LOCUS       CR931663               18517 bp    DNA     linear   BCT 18-SEP-2008
>                                       • richSequence.feature index: 45; line number in the GenBank record: 442
>               • The complete exception message:
> org.hibernate.PropertyValueException: not-null property references a null or transient value: Location.feature
>         at org.hibernate.engine.Nullability.checkNullability(Nullability.java:72)
>         at org.hibernate.event.def.AbstractSaveEventListener.performSaveOrReplicate(AbstractSaveEventListener.java:290)
>         at org.hibernate.event.def.AbstractSaveEventListener.performSave(AbstractSaveEventListener.java:181)
>         at org.hibernate.event.def.AbstractSaveEventListener.saveWithGeneratedId(AbstractSaveEventListener.java:121)
>         at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.saveWithGeneratedOrRequestedId(DefaultSaveOrUpdateEventListener.java:187)
>         at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.entityIsTransient(DefaultSaveOrUpdateEventListener.java:172)
>         at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.performSaveOrUpdate(DefaultSaveOrUpdateEventListener.java:94)
>         at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.onSaveOrUpdate(DefaultSaveOrUpdateEventListener.java:70)
>         at org.hibernate.impl.SessionImpl.fireSaveOrUpdate(SessionImpl.java:507)
>         at org.hibernate.impl.SessionImpl.saveOrUpdate(SessionImpl.java:499)
>         at org.hibernate.engine.CascadingAction$5.cascade(CascadingAction.java:218)
>         at org.hibernate.engine.Cascade.cascadeToOne(Cascade.java:268)
>         at org.hibernate.engine.Cascade.cascadeAssociation(Cascade.java:216)
>         at org.hibernate.engine.Cascade.cascadeProperty(Cascade.java:169)
>         at org.hibernate.engine.Cascade.cascadeCollectionElements(Cascade.java:296)
>         at org.hibernate.engine.Cascade.cascadeCollection(Cascade.java:242)
>         at org.hibernate.engine.Cascade.cascadeAssociation(Cascade.java:219)
>         at org.hibernate.engine.Cascade.cascadeProperty(Cascade.java:169)
>         at org.hibernate.engine.Cascade.cascade(Cascade.java:130)
>         at org.hibernate.event.def.AbstractSaveEventListener.cascadeAfterSave(AbstractSaveEventListener.java:456)
>         at org.hibernate.event.def.AbstractSaveEventListener.performSaveOrReplicate(AbstractSaveEventListener.java:334)
>         at org.hibernate.event.def.AbstractSaveEventListener.performSave(AbstractSaveEventListener.java:181)
>         at org.hibernate.event.def.AbstractSaveEventListener.saveWithGeneratedId(AbstractSaveEventListener.java:121)
>         at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.saveWithGeneratedOrRequestedId(DefaultSaveOrUpdateEventListener.java:187)
>         at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.entityIsTransient(DefaultSaveOrUpdateEventListener.java:172)
>         at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.performSaveOrUpdate(DefaultSaveOrUpdateEventListener.java:94)
>         at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.onSaveOrUpdate(DefaultSaveOrUpdateEventListener.java:70)
>         at org.hibernate.impl.SessionImpl.fireSaveOrUpdate(SessionImpl.java:507)
>         at org.hibernate.impl.SessionImpl.saveOrUpdate(SessionImpl.java:499)
>         at org.hibernate.engine.CascadingAction$5.cascade(CascadingAction.java:218)
>         at org.hibernate.engine.Cascade.cascadeToOne(Cascade.java:268)
>         at org.hibernate.engine.Cascade.cascadeAssociation(Cascade.java:216)
>         at org.hibernate.engine.Cascade.cascadeProperty(Cascade.java:169)
>         at org.hibernate.engine.Cascade.cascadeCollectionElements(Cascade.java:296)
>         at org.hibernate.engine.Cascade.cascadeCollection(Cascade.java:242)
>         at org.hibernate.engine.Cascade.cascadeAssociation(Cascade.java:219)
>         at org.hibernate.engine.Cascade.cascadeProperty(Cascade.java:169)
>         at org.hibernate.engine.Cascade.cascade(Cascade.java:130)
>         at org.hibernate.event.def.AbstractSaveEventListener.cascadeAfterSave(AbstractSaveEventListener.java:456)
>         at org.hibernate.event.def.AbstractSaveEventListener.performSaveOrReplicate(AbstractSaveEventListener.java:334)
>         at org.hibernate.event.def.AbstractSaveEventListener.performSave(AbstractSaveEventListener.java:181)
>         at org.hibernate.event.def.AbstractSaveEventListener.saveWithGeneratedId(AbstractSaveEventListener.java:121)
>         at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.saveWithGeneratedOrRequestedId(DefaultSaveOrUpdateEventListener.java:187)
>         at org.hibernate.event.def.DefaultSaveEventListener.saveWithGeneratedOrRequestedId(DefaultSaveEventListener.java:33)
>         at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.entityIsTransient(DefaultSaveOrUpdateEventListener.java:172)
>         at org.hibernate.event.def.DefaultSaveEventListener.performSaveOrUpdate(DefaultSaveEventListener.java:27)
>         at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.onSaveOrUpdate(DefaultSaveOrUpdateEventListener.java:70)
>         at org.hibernate.impl.SessionImpl.fireSave(SessionImpl.java:535)
>         at org.hibernate.impl.SessionImpl.save(SessionImpl.java:523)
>         at trashtesting.GenBankLoaderTesting.main(GenBankLoaderTesting.java:78)
>  
>  

--
Richard Holland, BSc MBCS
Operations and Delivery Director, Eagle Genomics Ltd
T: +44 (0)1223 654481 ext 3 | E: [email protected]
http://www.eaglegenomics.com/


_______________________________________________
Biojava-l mailing list  -  [email protected]
http://lists.open-bio.org/mailman/listinfo/biojava-l
