
Re: [galaxy-dev] Problems with DataImport (fixed)

2011-12-31 Thread Ted Goldstein

This bug irritated me, so I fixed it. Essentially, add_file() in upload.py is
not in on the joke that local dirs are relative paths and need the absolute
path tacked onto them.
Is there a written process for submitting the fix? I could not find one.
Thanks,
Ted


diff -r 21b645303c02 tools/data_source/upload.py
--- a/tools/data_source/upload.py   Thu Dec 22 13:54:33 2011 -0500
+++ b/tools/data_source/upload.py   Sat Dec 31 15:29:45 2011 -0800
@@ -74,7 +74,10 @@
 id, files_path, path = arg.split( ':', 2 )
 rval[int( id )] = ( path, files_path )
 return rval
-def add_file( dataset, registry, json_file, output_path ):
+
+import pdb
+
+def add_file( dataset, registry, json_file, output_path, root_dir):
 data_type = None
 line_count = None
 converted_path = None
@@ -94,7 +97,10 @@
 file_err( 'Unable to fetch %s\n%s' % ( dataset.path, str( e ) ), dataset, json_file )
 return
 dataset.path = temp_name
-# See if we have an empty file
+
+if dataset.type == 'server_dir' and not os.path.isabs( dataset.path):
+   dataset.path = os.path.join( root_dir, dataset.path )
+   
 if not os.path.exists( dataset.path ):
 file_err( 'Uploaded temporary file (%s) does not exist.' % dataset.path, dataset, json_file )
 return
@@ -384,7 +390,7 @@
 files_path = output_paths[int( dataset.dataset_id )][1]
 add_composite_file( dataset, registry, json_file, output_path, files_path )
 else:
-add_file( dataset, registry, json_file, output_path )
+add_file( dataset, registry, json_file, output_path , sys.argv[1])
 # clean up paramfile
 try:
 os.remove( sys.argv[3] )
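
The path handling that the patch adds can be sketched in isolation. This is a
minimal sketch, not the code in upload.py: the function name
resolve_dataset_path and its plain-string arguments are hypothetical, standing
in for dataset.path, dataset.type, and the root directory that the patch
passes in as sys.argv[1].

```python
import os


def resolve_dataset_path(path, dataset_type, root_dir):
    """Return a usable path for a dataset (hypothetical helper).

    Mirrors the check in the patch: a 'server_dir' dataset whose path is
    relative is joined onto root_dir; anything else is returned unchanged.
    """
    if dataset_type == 'server_dir' and not os.path.isabs(path):
        return os.path.join(root_dir, path)
    return path


if __name__ == '__main__':
    # Example values are made up for illustration only.
    print(resolve_dataset_path('library/sample.bam', 'server_dir', '/galaxy/root'))
    print(resolve_dataset_path('/data/sample.bam', 'server_dir', '/galaxy/root'))
```

Leaving absolute paths and other dataset types untouched keeps the change
limited to the case described above, so it should not affect ordinary uploads.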





___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/

[galaxy-dev] Problems with DataImport

2011-12-29 Thread Ted Goldstein

Hi there,
Here are three interrelated issues.

I am trying to use Galaxy with some large cancer genomic datasets here at UCSC
and do some systems biology. I have petabyte-size data libraries that will
constantly be in flux at the edges. I would prefer to have Galaxy read the
metadata for large datasets from the file system without using the database.
Is there a convenient API boundary at which to write an adapter to the dataset
object interface?

In the meantime, I am going to try to just import data using the link option.
It's great that this feature is already in. When I import a couple of modest,
megabyte-size datasets using the "Link to files without copying to Galaxy"
option, the status never changes from queued. Is this a bug? Is there a known
workaround? I have many large datasets.

Also, it takes a long time to expand the dataset name link. (My import
experiment is a data tree of about a thousand files.) Is this a known bug?

Thanks!
Ted
