Hi Claudia,

I think we're missing a join the query below. the Itemtobundle table is 
never joined to bundle, which gives us a Cartesian join...
I rearranged the query a bit:

Thanks!

select b.size_bytes
from item i, metadatavalue m, item2bundle i2b, bundle b, 
bundle2bitstream b2b, bitstream bi
where i.in_archive=true
and m.metadata_field_id=11
and b.name='ORIGINAL'
and b.deleted=false
and i.item_id=m.item_id
and i.item_id=i2b.item_id
and i2b.bundle_id = b.bundle_id
and b.bundle_id=b2b.bundle_id
and b2b.bitstream_id=b.bitstream_id;

Claudia Jürgen wrote:
> Hi Susan,
>
> defining the total size of bitstreams ingested is impossible if you 
> have deleted items and/or bitstreams within the same year. You can get 
> the number and size of bitstreams which are at the moment in your 
> system, e.g. belonging to items still in archive and beeing 
> accessioned in a given year.
>
> with
>
> select b.size_bytes from bitstream b, bundle2bitstream b2b, 
> item2bundle i2b, bundle, item i, metadatavalue m
> where i.in_archive=true
> and i.item_id=m.item_id
> and m.metadata_field_id=11
> and m.text_value like '2008%'
> and i.item_id=i2b.item_id
> and bundle.name='ORIGINAL'
> and bundle.bundle_id=b2b.bundle_id
> and b2b.bitstream_id=b.bitstream_id
> and b.deleted=false;
>
> (The use of and instead of joins was just for the sake of readability.)
>
> Notes:
> - in the bundle ORIGINAL (no extracted text, license etc)
> - for items in archive (not withdrawn)
> - bitstreams not deleted
> - accessioned in e.g.2008
> - dc.date.accessioned metadata_field_id=11
>
>
> You can use select count (b.*) ... for the number of bitstreams
> You can use select sum (b.size_bytes) .... for the sum of size_bytes.
>
>
> Hope that helps
>
> Claudia
>
>
> Susan Teague Rector schrieb:
>> Hi all,
>>
>> I just wanted to run this past the Dspace experts - I'm determining 
>> how many files and the total size of files ingested into our Dspace 
>> installation last year.
>> Here's the query I'm running.
>>
>> Since I don't look at the Dspace database on a daily basis, I needed 
>> a sanity check from you folks - Is this is the best way to get at 
>> this data?
>>
>> Thanks in advance,
>>
>> Susan Teague Rector
>> Web Applications Manager
>> VCU Libraries
>>
>>
>> select id.item_id, id.date_issued, ib.bundle_id,
>> b.bitstream_id, b.name, b.size_bytes, b.source
>> from itemsbydate id inner join
>> item2bundle ib on id.item_id = ib.item_id
>> inner join bundle2bitstream bb on ib.bundle_id = bb.bundle_id
>> inner join bitstream b on bb.bitstream_id = b.bitstream_id
>> where name <> 'license.txt'
>> and (source not like '%Written by Media%') or source is null
>> order by id.date_issued, ib.bundle_id;
>>
>> ------------------------------------------------------------------------------
>>  
>>
>> _______________________________________________
>> DSpace-tech mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/dspace-tech 

------------------------------------------------------------------------------
_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech

Reply via email to