Hi,

Here is the PR https://github.com/DSpace/DSpace/pull/9653. I hope it will
solve your problems.

Best regards,
Damian Józefowski

On Mon, 17 Jun 2024 at 11:22, SAI KUMAR S <[email protected]>
wrote:

> Hi,
>
> Thank you for your email.
>
> Could you please share the information needed to get the solution?
>
> Thanks & Regards
> Sai Kumar S
>
> On Thu, 13 Jun 2024 at 18:47, SAI KUMAR S <[email protected]>
> wrote:
>
>> Hi Damian Józefowski,
>>
>> Thank you.
>>
>> Could you please share the information that could help us get a
>> solution to the issue?
>>
>> Regards
>> Sai Kumar S
>>
>> On Thursday 13 June 2024 at 18:32:38 UTC+5:30 Damian Józefowski wrote:
>>
>>> Hi,
>>> In one of our projects at PCG we have added an option to this script
>>> which allows you to set a date from which the thumbnails will be generated.
>>> The date is compared against each item's last-modified value.
>>> If you are interested, we can prepare a PR with this solution.
>>>
>>> Best regards
>>> Damian Józefowski
>>>
>>> On Wed, 12 Jun 2024 at 17:24, DSpace Community <[email protected]>
>>> wrote:
>>>
>>>> Hi Sai & Daan,
>>>>
>>>> The filter-media script always loops through all objects to *determine*
>>>> which ones need to be processed.  This script is in charge of *not only*
>>>> thumbnails, but also extracting text for indexing purposes (and any
>>>> other actions that are enabled as "filter.plugins" in your dspace.cfg).
>>>> See the full docs at
>>>> https://wiki.lyrasis.org/display/DSDOC7x/Mediafilters+for+Transforming+DSpace+Content
>>>>
>>>> So, this script doesn't keep a list of objects which have already had
>>>> thumbnails generated.  The reason is that, even if a file has a generated
>>>> thumbnail, it's possible the file needs to be processed by other filters
>>>> (e.g. for full text indexing the textual content may be extracted).  So,
>>>> every time you run "filter-media" it will loop through every file...but
>>>> will skip any files that it notices were already processed (e.g. if the
>>>> file already has a thumbnail or extracted text, it will not re-generate it
>>>> unless you use the "-f" flag to force regeneration).
>>>>
>>>> The "skip mode" (-s flag) concept can also be used to tell it to skip
>>>> entire communities/collections/items...but then it will never process that
>>>> object again until it is removed from the skip list.  So, this should be
>>>> used sparingly unless you are sure the object never will need a new
>>>> thumbnail or full text indexing, etc.
>>>>
>>>> There are options to process files little by little (using the "-m" or
>>>> maximum flag) or even process files community-by-community or
>>>> collection-by-collection (using the "-i" or identifier flag) in order to
>>>> break down a larger job into smaller chunks.
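
A minimal sketch of the chunking described above, for illustration only. It
prints one filter-media command per collection, each capped with -m, rather
than running anything. The handles (123456789/2 and 123456789/3), the
DSPACE_BIN and MAX_PER_RUN variables, and the limit of 100 are assumptions
made up for this example; only the -i and -m flags come from the message.

```shell
#!/bin/sh
# Sketch only: handles, paths and the limit below are hypothetical placeholders.
DSPACE_BIN="./dspace"     # assumed location of the dspace launcher
MAX_PER_RUN=100           # assumed cap passed to -m for each run

# Build (print) the filter-media command line for a single handle.
filter_media_cmd() {
    printf '%s filter-media -i %s -m %s\n' "$DSPACE_BIN" "$1" "$MAX_PER_RUN"
}

# Print one command per collection handle; nothing is executed here.
for handle in 123456789/2 123456789/3; do
    filter_media_cmd "$handle"
    # To actually run the job instead of printing it, use:
    # "$DSPACE_BIN" filter-media -i "$handle" -m "$MAX_PER_RUN"
done
```

Printing the commands first makes it easy to review the list of handles
before starting a long-running job.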
>>>>
>>>> This is simply how this tool works at this time.  I do agree there may
>>>> be ways to make it more efficient.  But, we haven't had a developer
>>>> volunteer to do such work or to redesign the current process.  If you or
>>>> anyone else out in the community are interested in helping to improve this
>>>> tool, I'm sure the Committers would welcome ideas.  All code in DSpace is
>>>> built/supported by volunteers and users. We don't have a centralized
>>>> development team (i.e. I have no developers working for me).
>>>>
>>>> Semi-related to this, there have been past discussions about
>>>> migrating all media filter scripts/tools into curation tasks (which would
>>>> allow these processes to be run one-by-one as each new submission is added
>>>> to DSpace, instead of via the current bulk processing script).  There are
>>>> some older tickets/PRs related to that, but the work has never been
>>>> finished / found to be fully working.  See
>>>> https://github.com/DSpace/DSpace/issues/6398 and
>>>> https://github.com/DSpace/DSpace/pull/1674   (That said, I'd love to
>>>> see this work completed at some point.)
>>>>
>>>> Tim
>>>>
>>>>
>>>>
>>>> On Tuesday, June 11, 2024 at 8:58:49 AM UTC-5 [email protected]
>>>> wrote:
>>>>
>>>>> Hi Daan,
>>>>>
>>>>> Thank you for your reply.
>>>>>
>>>>> As you said, if I have to restore an entire database and the
>>>>> assetstore, it depends on whether the thumbnails were generated before
>>>>> the backup was taken. If they were, there should be no need to
>>>>> regenerate them from scratch (I may be wrong; please correct me if any
>>>>> of this is incorrect).
>>>>>
>>>>> What I wanted to know is why, when I start thumbnail generation, it
>>>>> starts from scratch (although the already generated thumbnails are
>>>>> skipped anyway).
>>>>>
>>>>> I wondered whether there is any other method in which already
>>>>> generated thumbnails are not read at all, and thumbnails are generated
>>>>> only for the files that do not have them yet.
>>>>>
>>>>> Regards
>>>>> Sai Kumar S
>>>>>
>>>>> On Tue, 11 Jun 2024, 11:06 am Daan Lessing, <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Good morning,
>>>>>>
>>>>>> Just a follow-up question on this. Let's say for instance you have to
>>>>>> restore an entire database and the assetstore, do you lose all thumbnails
>>>>>> and will filter-media have to start building thumbnails from scratch?
>>>>>>
>>>>>> I have been running filter-media and it has been running for 3 weeks
>>>>>> and has not yet completed.
>>>>>>
>>>>>> Looking forward to your response.
>>>>>>
>>>>>> Kind regards,
>>>>>> Daan
>>>>>>
>>>>>> On Tue, Jun 11, 2024 at 6:28 AM SAI KUMAR S <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Tim
>>>>>>>
>>>>>>> Thank you for the information.
>>>>>>>
>>>>>>> The issue is that when we run the command line *./dspace
>>>>>>> filter-media*, the thumbnail-generated files are also read, but
>>>>>>> they are skipped. This means the process reads the files from the
>>>>>>> beginning each time, which takes more time as the number of files
>>>>>>> increases.
>>>>>>>
>>>>>>> Is there any other method, such as executing a script, for
>>>>>>> generating thumbnails more efficiently?
>>>>>>>
>>>>>>> Regards
>>>>>>> Sai Kumar S
>>>>>>>
>>>>>>> On Tuesday 11 June 2024 at 02:37:15 UTC+5:30 DSpace Community wrote:
>>>>>>>
>>>>>>>> Hi Sai,
>>>>>>>>
>>>>>>>> If you run "filter-media" **without** the "-f" flag, then it should
>>>>>>>> automatically skip all Items that already have generated thumbnails.
>>>>>>>> For example:
>>>>>>>>
>>>>>>>> ./dspace filter-media
>>>>>>>>
>>>>>>>> When you run it **with** the "-f" flag, that tells the filter-media
>>>>>>>> script to **regenerate all thumbnails**.
>>>>>>>>
>>>>>>>> For more information see the documentation on this script
>>>>>>>> <https://wiki.lyrasis.org/display/DSDOC7x/Mediafilters+for+Transforming+DSpace+Content#MediafiltersforTransformingDSpaceContent-Executing(viaCommandLine)>
>>>>>>>> .
>>>>>>>>
>>>>>>>> (The "skip list" is only needed if you have files which are
>>>>>>>> consistently throwing errors and you want to *skip them from all future
>>>>>>>> runs* of the "filter-media" script.  But, it shouldn't be necessary in
>>>>>>>> your use case.)
>>>>>>>>
>>>>>>>> Tim
>>>>>>>>
>>>>>>>> On Monday, June 10, 2024 at 5:09:33 AM UTC-5 [email protected]
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi All,
>>>>>>>>>
>>>>>>>>> I have a query regarding filter-media. I have uploaded around 1000
>>>>>>>>> books to a collection and generated thumbnails for the PDF files
>>>>>>>>> using the command line *dspace filter-media -f*.
>>>>>>>>>
>>>>>>>>> However, when I upload another 1000 files to the same collection,
>>>>>>>>> I need to generate thumbnails only for the newly uploaded files. I
>>>>>>>>> tried using the skip mode by creating a *skip-list.txt*, but I am not
>>>>>>>>> getting the desired result.
>>>>>>>>>
>>>>>>>>> Could any of you provide me with an example of how to correctly use
>>>>>>>>> the skip-list.txt method to generate thumbnails?
>>>>>>>>>
>>>>>>>>> Alternatively, is there any other method, such as using a script
>>>>>>>>> (e.g., Python), to generate the thumbnails for only the newly uploaded
>>>>>>>>> files?
>>>>>>>>>
>>>>>>>>> Please help me solve this query.
>>>>>>>>>
>>>>>>>>> Thanks & Regards
>>>>>>>>> Sai Kumar S
>>>>>>>>>
>>>>>>>>> --
>>>>>>> All messages to this mailing list should adhere to the Code of
>>>>>>> Conduct: https://www.lyrasis.org/about/Pages/Code-of-Conduct.aspx
>>>>>>> ---
>>>>>>> You received this message because you are subscribed to the Google
>>>>>>> Groups "DSpace Community" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>> send an email to [email protected].
>>>>>>> To view this discussion on the web visit
>>>>>>> https://groups.google.com/d/msgid/dspace-community/07d120cd-74de-4420-b49d-d3ee6744738an%40googlegroups.com
>>>>>>> <https://groups.google.com/d/msgid/dspace-community/07d120cd-74de-4420-b49d-d3ee6744738an%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>> .
>>>>>>>
>

