Hi Mark,
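The media filter supports a skip list: pass -s with a comma-separated
list of identifiers (handles) and it will skip the bitstreams belonging
to them.
For example, from our crontab: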
0 2 * * * nice /dspace/bin/dspace filter-media -s 1811/28447,1811/28465,1811/52046 -a > /home/dspace/log-output/filter-media.latest.log
Full Syntax:
peterdietz:osulibrariesDSpace peterdietz$ /dspace/bin/dspace filter-media --help
usage: MediaFilterManager
 -p,--plugins      ONLY run the specified Media Filter plugin(s)
                   listed from 'filter.plugins' in dspace.cfg.
                   Separate multiple with a comma (,)
                   (e.g. MediaFilterManager -p
                   "Word Text Extractor","PDF Text Extractor")
 -s,--skip         SKIP the bitstreams belonging to identifier
                   Separate multiple identifiers with a comma (,)
                   (e.g. MediaFilterManager -s
                   123456789/34,123456789/323)
 -f,--force        force all bitstreams to be processed
 -h,--help         help
 -i,--identifier   ONLY process bitstreams belonging to identifier
 -m,--maximum      process no more than maximum items
 -n,--noindex      do NOT update the search index after filtering
                   bitstreams
 -q,--quiet        do not print anything except in the event of errors.
 -v,--verbose      print all extracted text and other details to STDOUT
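
If the skip list grows, typing it inline in the crontab gets unwieldy. Here is a minimal sketch of one way to keep the handles in a file and build the comma-separated value for -s from it. The function name build_skip_list and the file path in the comment are my own assumptions, not anything that ships with DSpace.

```shell
#!/bin/sh
# Hypothetical helper (not part of DSpace): read a file containing one
# handle per line and print them joined with commas, which is the form
# the -s option expects. Blank lines and lines starting with '#' are
# dropped so the file can carry comments.
build_skip_list() {
    grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$1" | paste -s -d, -
}

# Example use (the skip-file path is an assumption):
#   /dspace/bin/dspace filter-media -s "$(build_skip_list /home/dspace/skip-handles.txt)"
```

The function only prints the list, so you can check what it produces before wiring it into the cron job.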
Peter Dietz
On Mon, Mar 11, 2013 at 3:21 PM, helix84 <[email protected]> wrote:
> On Mon, Mar 11, 2013 at 8:07 PM, Mark Ludwig <[email protected]> wrote:
> > Basically, we don't need this kind of item indexed.
> > Is there a way to turn off indexing of an item
> > or a file within an item? Are there any other
> > ways to deal with this?
>
> Hi Mark,
>
> yes, simply upload a file with the same name (.txt) into the TEXT
> bundle. Once that file exists (it can be even empty), the media filter
> won't try to recreate it.
>
> > Another thing about this is that DSpace never frees up the
> > space after deleting items. I'd really like my
> > Gbytes back when I delete large items.
> > It looks like you regain index space after filter-media
> > re-runs, but the gigage in assetstore is wasted.
>
> That's also a "feature". Simply run "[dspace]/bin/dspace cleanup" to
> get rid of all the files marked as deleted.
>
>
> Regards,
> ~~helix84
>
> Compulsory reading: DSpace Mailing List Etiquette
> https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
>
>
> _______________________________________________
> DSpace-tech mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dspace-tech
> List Etiquette:
> https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
>