Hi Rohit

I think the problem is that when you download the file from NCBI you
must ensure that the “Show Sequence” option is selected. You will then get the
GenBank file that includes the sequence.
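
One quick way to check this before converting is to look for sequence lines
after the ORIGIN line of the GenBank file. A minimal Python sketch, assuming
the standard flat-file layout (the function name genbank_has_sequence is only
illustrative, not part of any tool mentioned here):

```python
def genbank_has_sequence(path):
    """Return True if the GenBank flat file has sequence lines after ORIGIN."""
    in_origin = False
    with open(path) as handle:
        for line in handle:
            if line.startswith("ORIGIN"):
                in_origin = True   # sequence lines, if any, follow this line
                continue
            if line.startswith("//"):
                in_origin = False  # end of the current record
                continue
            # Sequence lines look like "        1 gatcctccat atacaacggt ..."
            if in_origin and any(ch.isalpha() for ch in line):
                return True
    return False
```

If this returns False for a downloaded file, the file most likely holds
annotation only and needs to be downloaded again with the sequence included.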

There is a dump of the Postgres database, with the data loaded, at:
ftp://ftp.sanger.ac.uk/pub/pathogens/workshops/GMOD2009/chado_pathogen_plus_plasmo.sql_dump.gz

If you need to reload a chromosome, you will need to delete it from the
feature table.
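
As a rough illustration of that delete, here is a Python sketch that works
with any DB-API cursor using the '%s' placeholder style (as psycopg2 does).
The function name delete_feature is hypothetical, and it assumes the
chromosome is identified by its uniquename and organism_id. What happens to
rows in other tables that refer to it depends on the foreign keys in your
schema, so it is worth trying on a copy of the database first.

```python
def delete_feature(cursor, uniquename, organism_id):
    """Delete matching rows from the Chado feature table; return the row count."""
    # Parameterized query: the values are passed separately from the SQL text.
    cursor.execute(
        "DELETE FROM feature WHERE uniquename = %s AND organism_id = %s",
        (uniquename, organism_id),
    )
    return cursor.rowcount
```

With psycopg2 this would be called with a cursor from a connection to the
database, followed by a commit. A return value of 0 means nothing matched.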

Regards
Tim

On 10/30/09 5:58 PM, "parimi rohit" <rohit.par...@gmail.com> wrote:

> Hi Tim,
> 
> The GenBank file that I downloaded and renamed as *.gbk does not have any
> sequence data in it, so I downloaded the FASTA file from the link and saved
> it in the same directory. I used bp_genbank2gff.pl to convert them to GFF
> files again. Then I tried to load the data into the database using the
> command given in the tutorial, and the data loaded in the same way as it did
> previously.
> 
> 
> [ro...@agron-90-78 Chadotesting]$ gmod_bulk_load_gff3.pl -organism Pknowlesi
> -dbname rohit_chado_01 -dbuser rohit -dbport 5432 -dbpass redearth
> -recreate_cache < NC_011907.gbk.gff
> (Re)creating the uniquename cache in the database...
> Creating table...
> Populating table...
> Creating indexes...Done.
> Preparing data for inserting into the rohit_chado_01 database
> (This may take a while ...)
> Loading data into feature table ...
> Loading data into featureloc table ...
> Loading data into feature_relationship table ...
> Loading data into featureprop table ...
> Skipping feature_cvterm table since the load file is empty...
> Loading data into synonym table ...
> Loading data into feature_synonym table ...
> Loading data into dbxref table ...
> Loading data into feature_dbxref table ...
> Skipping analysisfeature table since the load file is empty...
> Loading data into cvterm table ...
> Loading data into db table ...
> Skipping cv table since the load file is empty...
> Skipping analysis table since the load file is empty...
> Skipping organism table since the load file is empty...
> Adding cvtermprop=MapReferenceType for 'chromosome' ...
> Loading sequences (if any) ...
> Optimizing database (this may take a while) ...
>   (feature featureloc feature_relationship featureprop feature_cvterm synonym
> feature_synonym dbxref feature_dbxref analysisfeature cvterm db cv analysis
> organism ) Done.
> 
> While this script has made an effort to optimize the database, you
> should probably also run VACUUM FULL ANALYZE on the database as well
> 
> 
> I don't understand why some of the tables are skipped. Also, the query that I
> wrote in my previous mail returned the same values, without any sequence
> residues.
> 
> Also, when I tried to reload an organism's data into the database using the
> gmod_bulk_load_gff3.pl script, it did not allow me to do so. It gives me
> this error:
> 
> [ro...@agron-90-78 Chadotesting]$ gmod_bulk_load_gff3.pl -organism Pknowlesi
> -dbname rohit_chado_01 -dbuser rohit -dbport 5432 -dbpass redearth
> -recreate_cache < NC_011907.gbk.gff
> (Re)creating the uniquename cache in the database...
> Creating table...
> Populating table...
> Creating indexes...Done.
> Preparing data for inserting into the rohit_chado_01 database
> (This may take a while ...)
> 
> no parent PKH_060005;
> you probably need to rerun the loader with the --recreate_cache option
> 
> Issuing rollback() due to DESTROY without explicit disconnect() of DBD::Pg::db
> handle dbname=rohit_chado_01;port=5432;host=localhost at
> /usr/local/lib/perl5/site_perl/5.8.9/Bio/GMOD/DB/Adapter.pm line 3882, <STDIN>
> line 3.
> 
> 
> I do not know what the problem is, so whenever I want to reload the data I
> install the Chado schema in the database again and load the ontologies
> again, which takes a lot of time. Can you tell me why this is happening and
> how to fix it?
> 
> I am attaching the gbk files that I am using as well as the gff file. Please
> let me know if there are any mistakes in these files that are causing this
> problem.
> 
> Regards,
> Rohit
> 
> On Tue, Oct 27, 2009 at 3:25 PM, Tim Carver <t...@sanger.ac.uk> wrote:
>> Hi Rohit
>> 
>> The last query you sent me shows that there are no sequence residues loaded
>> into your database for that sequence:
>> 
>> SELECT timelastmodified, f.feature_id AS id, uniquename,
>>    organism_id AS organismId, f.is_obsolete AS obsolete,
>>    f.name AS feature_name, f.type_id, f.dbxref_id AS dbXRefId,
>>    f.seqlen, residues FROM feature f WHERE f.feature_id=1;
>>       timelastmodified      | id | uniquename | organismid | obsolete | feature_name | type_id | dbxrefid | seqlen  | residues
>> ----------------------------+----+------------+------------+----------+--------------+---------+----------+---------+----------
>>  2009-10-27 14:11:17.572803 |  1 | NC_004314  |         13 | f        | NC_004314    |     449 |          | 1687655 |
>> (1 row)
>> 
>> Can you check the files you downloaded have the sequence in them?
>> 
>> Regards
>> Tim
>> 
>> 
>> On 10/27/09 8:09 PM, "parimi rohit" <rohit.par...@gmail.com> wrote:
>> 
>>> Hi Tim,
>>> 
>>> Thank you for the link that describes a way to get Artemis working with
>>> Chado. I followed the instructions on the page by downloading the 3
>>> files and modifying them accordingly. Then I loaded them into the database
>>> using the commands provided on the page. I copied the terminal output into
>>> the file attached to this mail.
>>> 
>>> Then I downloaded the stable release of Artemis again from the link given by
>>> you and executed the following command:
>>> 
>>> ./art -Dchado="localhost:5432/rohit_chado_01?rohit" -Dibatis \
>>>  Pfalciparum:NC_004314
>>> 
>>> 
>>> I have added all the information that I could collect from the log files,
>>> and the results of the queries that were executed, to the file that I
>>> attached. At a high level I understand that for some query the result is 0,
>>> and hence it has a problem reading null values.
>>> 
>>> But I don't understand why there is a null value, as the data is loaded
>>> correctly in the database.
>>> 
>>> Please let me know what the problem is, and whether I should change any of
>>> the configuration files in Artemis in order to make it work with Chado.
>>> 
>>> I did not make any changes to any of the files. I just downloaded the stable
>>> release and ran the command in the terminal.
>>> 
>>> I appreciate your help very much.
>>> 
>>> Regards,
>>> Rohit
>>>
>>> On Mon, Oct 26, 2009 at 4:02 PM, Tim Carver <t...@sanger.ac.uk> wrote:
>>>> Hi Rohit
>>>> 
>>>> I tried loading that example, but with the --noexon flag as the output you
>>>> sent suggested, and it complains about YAL069W not having a parent. As you
>>>> do not get any rows with the query I sent, I suspect it has not loaded
>>>> properly and it is not finding the chromosome feature.
>>>> 
>>>> For an example with Artemis have a look at this tutorial:
>>>> 
>>>> 
>>>> http://www.sanger.ac.uk/Software/Artemis/v11/chado/GMOD2009SummerSchool.shtml#Examples_of_Loading_Sequences_into_the_Database
>>>> 
>>>> The query I sent you searches the feature table for entries that have
>>>> residues for Artemis to open up (e.g. chromosome, contig features).
>>>> 
>>>> Regards
>>>> Tim
>>>> 
>>> 
>>> 
>> 
>> 
>> --  The Wellcome Trust Sanger Institute is operated by Genome Research
>> Limited, a charity registered in England with number 1021457 and a  company
>> registered in England with number 2742969, whose registered  office is 215
>> Euston Road, London, NW1 2BE.
>> 
> 


_______________________________________________
Artemis-users mailing list
Artemis-users@sanger.ac.uk
http://lists.sanger.ac.uk/mailman/listinfo/artemis-users
