Regarding my previous mail, I found this thread:
http://www.bytebucket.org/galaxy/galaxy-central/pull-request/175/parameter-based-bam-file-parallelization/diff

Is it still active? Would it be the best option for BAM
parallelization?

Thanks!
Best regards

On 23 April 2015 at 17:55, Roberto Alonso CIPF <ralo...@cipf.es> wrote:

> Hello,
> I am trying to write some code to make it possible to
> parallelize some tasks. Right now I am working on splitting a BAM file into
> several parts, and for this I created this simple tool:
>
> <parallelism method="multi" split_size="3" split_mode="number_of_parts"
> merge_outputs="output" split_inputs="input" ></parallelism>
>
>   <command>
>     java -jar
> /home/ralonso/software/GenomeAnalysisTK-3.3-0/GenomeAnalysisTK.jar -T
> UnifiedGenotyper -R /home/ralonso/BiB/Galaxy/data/chr_19_hg19_ucsc.fa -I
> $input -o $output 2&gt; /dev/null;
>
>   </command>
>   <inputs>
>     <param format="bam" name="input" type="data" label="bam"/>
>   </inputs>
>   <outputs>
>       <data format="vcf" name="output" />
>   </outputs>
>
> But I have one problem: when I execute the tool, it goes through this part
> of the code (I am working on the dev branch):
>
> *$galaxy/lib/galaxy/jobs/splitters/multi.py, line 75:*
>
>     for input in parent_job.input_datasets:
>         if input.name in split_inputs:
>             this_input_files = job_wrapper.get_input_dataset_fnames(input.dataset)
>             if len(this_input_files) > 1:
>                 log_error = "The input '%s' is composed of multiple files - splitting is not allowed" % str(input.name)
>                 log.error(log_error)
>                 raise Exception(log_error)
>             input_datasets.append(input.dataset)
>
> So it raises the exception because this_input_files contains two entries, namely:
> ['/home/ralonso/galaxy/database/files/000/dataset_171.dat',
> '/home/ralonso/galaxy/database/files/_metadata_files/000/metadata_13.dat'].
> I guess that:
> *dataset_171.dat* is the BAM file.
> *metadata_13.dat* is the BAI index file.
>
> So Galaxy can't proceed, and I don't know what the best solution
> would be. Maybe change the *if* to check only non-metadata files? I think
> I should use both files to create the BAM sub-files, but that
> would go inside the Bam class, in the *binary.py* file.
> Could you please guide me before I mess things up?
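
A minimal sketch of the "check only non-metadata files" idea, to make the proposal concrete. It is not Galaxy code: the helper names primary_input_files and check_splittable are hypothetical, and it assumes that metadata files can be recognised by living under a directory called _metadata_files, as in the two paths listed above. Galaxy may offer a more reliable way to tell them apart.

```python
import os


def primary_input_files(fnames, metadata_dir_name="_metadata_files"):
    """Return the paths that are not stored under a metadata directory.

    Assumption: metadata files (e.g. the BAI index) sit below a directory
    named `metadata_dir_name`, as in the paths quoted in this mail.
    """
    return [
        fname
        for fname in fnames
        if metadata_dir_name not in os.path.normpath(fname).split(os.sep)
    ]


def check_splittable(input_name, fnames):
    """Raise only if the input has more than one non-metadata file."""
    primary = primary_input_files(fnames)
    if len(primary) > 1:
        raise Exception(
            "The input '%s' is composed of multiple files - splitting is not allowed"
            % input_name
        )
    return primary


paths = [
    "/home/ralonso/galaxy/database/files/000/dataset_171.dat",
    "/home/ralonso/galaxy/database/files/_metadata_files/000/metadata_13.dat",
]
print(check_splittable("input", paths))
```

With a check like this the BAM dataset would pass, but the index would still have to be handed to whatever code actually cuts the BAM into parts, which is why the splitting logic probably belongs in the Bam class.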
>
> Thanks so much
> --
> Roberto Alonso
> Functional Genomics Unit
> Bioinformatics and Genomics Department
> Prince Felipe Research Center (CIPF)
> C./Eduardo Primo Yúfera (Científic), nº 3
> (junto Oceanografico)
> 46012 Valencia, Spain
> Tel: +34 963289680 Ext. 1021
> Fax: +34 963289574
> E-Mail: ralo...@cipf.es
>


