You may have to run filter-media with the -f option, but you can get
extracted text back, yes.  

B--
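
The jobs discussed in this thread can be chained in a small script. This is only a sketch under assumptions: the install path `/dspace`, the use of the `dspace` launcher script, and the run order (cleanup, then a forced filter-media, then index-init) are not stated in the thread, and the helper names `build_commands` and `run_all` are made up for illustration. It defaults to a dry run that only prints the commands.

```python
import subprocess

# Assumption: DSpace is installed under /dspace and provides the "dspace"
# launcher script; adjust the path for your own installation.
DSPACE_LAUNCHER = "/dspace/bin/dspace"


def build_commands(launcher=DSPACE_LAUNCHER):
    """Return the maintenance commands from this thread, in an assumed order."""
    return [
        [launcher, "cleanup"],             # remove bitstreams flagged as deleted
        [launcher, "filter-media", "-f"],  # force regeneration of extracted text
        [launcher, "index-init"],          # rebuild the search index
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing job


if __name__ == "__main__":
    run_all(build_commands())  # dry run by default: prints, executes nothing
```

Call `run_all(build_commands(), dry_run=False)` as the DSpace user to actually execute the jobs; the original poster ran the reindex with Tomcat stopped.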

>>> On 2/10/2012 at 12:00 PM, in message
<CAGD4+EvbL_OQudnpznJ+C=lxbacrpugaaz9o7zej0uf2urw...@mail.gmail.com>,
Paul Go
<[email protected]> wrote:
> What would happen if I deleted all of the Extracted Text files for the
> items that have redacted material?  Will new extracted text files be
> generated the next time the jobs are run?
> 
> Paul Go
> 
> Systems Librarian /
> Library Technology Manager
> Paul V. Galvin Library
> 35 West 33rd Street
> Chicago, IL  60616
> 312.567.7997
> [email protected] 
> 
> 
> 
> On Fri, Feb 10, 2012 at 12:52 PM, Brian Freels-Stendel
> <[email protected]> wrote:
> 
>> Hi Paul,
>>
>> I'm going to hope that DSIndexer can do the trick.
>>
>> If not, though, there is still a last-resort, manual option.  Signed in
>> as an administrator, you can edit item bitstreams individually.  The
>> extracted texts will show up under the TEXT Bundle, under the name
>> [originalFileName].txt.  If your new files are named the same as the old
>> ones were, you may see two .txt files with the same name.  If so, you
>> can either view each to see which is the bad one, or you could delete
>> them both and re-index.
>>
>> B--
>>
>> >>> On 2/10/2012 at 11:41 AM, in message
>> <CAGD4+EtT=GRH+H3QX-EJEzmVh=Z=s960fkfyffujlykgaig...@mail.gmail.com>,
>> Paul Go
>> <[email protected]> wrote:
>> > Sad to say, this did not solve the problem.  We ran filter-media,
>> > cleanup and index-init jobs.
>> >
>> > I have not looked into running DSIndexer directly yet but that is my
>> > next option.
>> >
>> > Paul Go
>> >
>> > Systems Librarian /
>> > Library Technology Manager
>> > Paul V. Galvin Library
>> > 35 West 33rd Street
>> > Chicago, IL  60616
>> > 312.567.7997
>> > [email protected] 
>> >
>> >
>> >
>> > On Fri, Feb 10, 2012 at 9:29 AM, Thornton, Susan M. (LARC-B702)[LITES] <
>> > [email protected]> wrote:
>> >
>> >> Please let us know if this solves your problem as I’m really curious.
>> >>
>> >> Thanks!
>> >>
>> >> Sue
>> >>
>> >> Sue Walker-Thornton
>> >> (w):  (757) 864-2368
>> >> (m):  (757) 506-9903
>> >>
>> >> From: Paul Go [mailto:[email protected]]
>> >> Sent: Friday, February 10, 2012 9:11 AM
>> >> To: Thornton, Susan M. (LARC-B702)[LITES]
>> >> Cc: Dspace Tech list
>> >> Subject: Re: [Dspace-tech] Full text search reindexing
>> >>
>> >> I ran index-init.  I will try the cleanup script as well.  Thank you.
>> >>
>> >> Paul Go
>> >>
>> >> Systems Librarian /
>> >> Library Technology Manager
>> >> Paul V. Galvin Library
>> >> 35 West 33rd Street
>> >> Chicago, IL  60616
>> >> 312.567.7997
>> >> [email protected] 
>> >>
>> >>
>> >> On Thu, Feb 9, 2012 at 4:36 PM, Thornton, Susan M. (LARC-B702)[LITES] <
>> >> [email protected]> wrote:
>> >>
>> >> I’m not 100% sure, but try running the “cleanup” script.  This removes
>> >> bitstreams where bitstream.deleted = true.
>> >>
>> >> Also, did you run “index-update” or “index-init” to rebuild your indices?
>> >> I always run index-init after I’ve done something like that, just to be
>> >> safe.
>> >>
>> >> Best regards,
>> >>
>> >> Sue
>> >>
>> >> Sue Walker-Thornton
>> >> (w):  (757) 864-2368
>> >> (m):  (757) 506-9903
>> >>
>> >> From: Paul Go [mailto:[email protected]]
>> >> Sent: Thursday, February 09, 2012 4:24 PM
>> >> To: DSpace General Mailing List; Dspace Tech list
>> >> Subject: [Dspace-tech] Full text search reindexing
>> >>
>> >> We have redacted some information from PDFs that are in our DSpace
>> >> instance.  This involved downloading, redacting, and re-ingesting the
>> >> files, making sure the originals with the offending information were
>> >> removed.  We've done a full reindexing (with Tomcat off) but the redacted
>> >> material is still showing up in a full-text search (even though the target
>> >> items no longer have the information).
>> >>
>> >> How can we force a re-index of the full-text search?  It was my
>> >> understanding that reindexing would do the trick.
>> >>
>> >>
>> >>
>> >> Paul Go
>> >>
>> >> Systems Librarian /
>> >> Library Technology Manager
>> >> Paul V. Galvin Library
>> >> 35 West 33rd Street
>> >> Chicago, IL  60616
>> >> 312.567.7997
>> >> [email protected]
>> >>
>> >>
>>

_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech
