Dear *Damian Józefowski*,

*Thank you for addressing the problem; your assistance is greatly 
appreciated.*
  
I need to know why *thumbnails are not being generated for larger files*. 
What could be causing this, and how can it be resolved?

Thanks & Regards
Sai Kumar S

On Monday 17 June 2024 at 19:02:58 UTC+5:30 Damian Józefowski wrote:

> Hi,
>
> Here is the PR https://github.com/DSpace/DSpace/pull/9653. I hope it will 
> solve your problems.
>
> Best regards,
> Damian Józefowski
>
> On Mon, 17 Jun 2024 at 11:22, SAI KUMAR S <[email protected]> wrote:
>
>> Hi,
>>
>> Thank you for your email.
>>
>> Could you please share the information needed to reach a solution?
>>
>> Thanks & Regards
>> Sai Kumar S
>>
>> On Thu, 13 Jun 2024 at 18:47, SAI KUMAR S <[email protected]> wrote:
>>
>>> Hi Damian Józefowski,
>>>
>>> Thank you.
>>>
>>> Could you please share the information that would help us resolve the 
>>> issue?
>>>
>>> Regards
>>> Sai Kumar S
>>>
>>> On Thursday 13 June 2024 at 18:32:38 UTC+5:30 Damian Józefowski wrote:
>>>
>>>> Hi, 
>>>> In one of our projects at PCG we have added an option to this script 
>>>> which allows you to set a date from which the thumbnails will be 
>>>> generated. 
>>>> The date is compared with each item's last-modified value.
>>>> If you are interested, we can prepare a PR with this solution.
>>>>
>>>> Best regards
>>>> Damian Józefowski
>>>>
>>>> On Wed, 12 Jun 2024, 17:24, DSpace Community <
>>>> [email protected]> wrote:
>>>>
>>>>> Hi Sai & Daan,
>>>>>
>>>>> The filter-media script always loops through all objects to 
>>>>> *determine* which ones need to be processed.  This script is in charge of 
>>>>> *not only* generating thumbnails, but also extracting text for indexing 
>>>>> purposes (and any other actions that are enabled as "filter.plugins" in 
>>>>> your dspace.cfg).  See the full docs at 
>>>>> https://wiki.lyrasis.org/display/DSDOC7x/Mediafilters+for+Transforming+DSpace+Content
>>>>>
>>>>> So, this script doesn't keep a list of objects which already have 
>>>>> generated thumbnails.  The reason is that, even if a file has a generated 
>>>>> thumbnail, it's possible the file needs to be processed by other filters 
>>>>> (e.g. for full text indexing the textual content may be extracted).  So, 
>>>>> every time you run "filter-media" it will loop through every file, but 
>>>>> will skip any files that it notices were already processed (e.g. if the 
>>>>> file already has a thumbnail or extracted text, it will not regenerate 
>>>>> it unless you use the "-f" flag to force regeneration).
>>>>>
>>>>> The "skip mode" (-s flag) concept can also be used to tell it to skip 
>>>>> entire communities/collections/items, but then it will never process 
>>>>> that object again until it is removed from the skip list.  So, this 
>>>>> should be used sparingly unless you are sure the object will never need 
>>>>> a new thumbnail or full text indexing, etc.
>>>>>
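A minimal sketch of skip mode, assuming (as the filter-media documentation appears to describe) that `-s` takes a comma-separated list of identifiers with no spaces, and that the list can be kept in a file. The handles and the file location are hypothetical. The script only prints the command unless DSPACE_CMD is set to a real dspace launcher.

```shell
#!/bin/sh
# Dry run by default: prints the command instead of executing it.
# Set DSPACE_CMD to the path of your dspace launcher to run it for real
# (the path must not contain spaces in this sketch).
DSPACE_CMD="${DSPACE_CMD:-echo ./dspace}"

# Hypothetical location of the skip list.
SKIP_FILE="${SKIP_FILE:-${TMPDIR:-/tmp}/skip-list.txt}"

# Write a comma-separated list of identifiers (hypothetical handles),
# with no spaces and no trailing newline.
write_skip_list() {
  printf '%s' '123456789/9,123456789/100' > "$SKIP_FILE"
}

# Pass the contents of the file to the -s flag.
run_with_skip_list() {
  $DSPACE_CMD filter-media -s "$(cat "$SKIP_FILE")"
}

write_skip_list
run_with_skip_list
```

Anything listed in the file stays skipped on every later run until it is removed from the list, which is why the message above advises using it sparingly.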
>>>>> There are options to process files little by little (using the "-m" or 
>>>>> maximum flag) or even process files community-by-community or 
>>>>> collection-by-collection (using the "-i" or identifier flag) in order to 
>>>>> break down a larger job into smaller chunks.
>>>>>
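The chunked approach described above could be sketched as follows. `-i` and `-m` are the flags named in the message. The collection handles and the limit of 500 are hypothetical. The script only prints each command unless DSPACE_CMD is set to a real dspace launcher.

```shell
#!/bin/sh
# Dry run by default: prints each command instead of executing it.
# Set DSPACE_CMD to the path of your dspace launcher to run it for real
# (the path must not contain spaces in this sketch).
DSPACE_CMD="${DSPACE_CMD:-echo ./dspace}"

# Hypothetical collection handles; replace with your own.
HANDLES="123456789/10 123456789/20"

# Hypothetical cap passed to the maximum flag.
MAX_ITEMS=500

# Process one collection at a time using the identifier flag.
run_by_collection() {
  for handle in $HANDLES; do
    $DSPACE_CMD filter-media -i "$handle"
  done
}

# Limit a single run using the maximum flag.
run_limited() {
  $DSPACE_CMD filter-media -m "$MAX_ITEMS"
}

run_by_collection
run_limited
```

Neither call uses "-f", so files that already have thumbnails are still skipped. The flags only narrow how much each run has to loop through.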
>>>>> This is simply how this tool works at this time.  I do agree there may 
>>>>> be ways to make it more efficient.  But, we haven't had a developer 
>>>>> volunteer to do such work or to redesign the current process.  If you or 
>>>>> anyone else out in the community are interested in helping to improve 
>>>>> this tool, I'm sure the Committers would welcome ideas.  All code in 
>>>>> DSpace is built/supported by volunteers and users. We don't have a 
>>>>> centralized development team (i.e. I have no developers working for me).
>>>>>
>>>>> Semi-related to this, there have been past discussions about 
>>>>> migrating all media filter scripts/tools into curation tasks (which would 
>>>>> allow these processes to be run one-by-one as each new submission is 
>>>>> added to DSpace, instead of via the current bulk processing script).  
>>>>> There are some older tickets/PRs related to that, but the work has never 
>>>>> been finished / found to be fully working.  See 
>>>>> https://github.com/DSpace/DSpace/issues/6398 and 
>>>>> https://github.com/DSpace/DSpace/pull/1674   (That said, I'd love to 
>>>>> see this work completed at some point.)
>>>>>
>>>>> Tim
>>>>>
>>>>>
>>>>>
>>>>> On Tuesday, June 11, 2024 at 8:58:49 AM UTC-5 [email protected] 
>>>>> wrote:
>>>>>
>>>>>> Hi Daan,
>>>>>>
>>>>>> Thank you for your reply.
>>>>>>
>>>>>> As you said, if I have to restore an entire database and the 
>>>>>> assetstore, it depends on whether the thumbnails were generated before 
>>>>>> the backup was taken. If they were, there should be no need to 
>>>>>> regenerate them from scratch (I may be wrong; please correct me if any 
>>>>>> of this is incorrect).
>>>>>>
>>>>>> What I wanted to know is why, when I start thumbnail generation, it 
>>>>>> starts from scratch (although the already generated thumbnails are 
>>>>>> skipped anyway).
>>>>>>
>>>>>> I wondered whether there is another method in which files with 
>>>>>> already generated thumbnails are not read at all, and thumbnails are 
>>>>>> generated only for the files that do not have them yet.
>>>>>>
>>>>>> Regards 
>>>>>> Sai Kumar S 
>>>>>>
>>>>>> On Tue, 11 Jun 2024, 11:06 am Daan Lessing, <[email protected]> 
>>>>>> wrote:
>>>>>>
>>>>>>> Good morning,
>>>>>>>
>>>>>>> Just a follow-up question on this. Let's say for instance you have 
>>>>>>> to restore an entire database and the assetstore, do you lose all 
>>>>>>> thumbnails and will filter-media have to start building thumbnails from 
>>>>>>> scratch?
>>>>>>>
>>>>>>> I have been running filter-media for 3 weeks and it has not yet 
>>>>>>> completed.
>>>>>>>
>>>>>>> Looking forward to your response.
>>>>>>>
>>>>>>> Kind regards,
>>>>>>> Daan
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jun 11, 2024 at 6:28 AM SAI KUMAR S <[email protected]> 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Tim
>>>>>>>>
>>>>>>>> Thank you for the information.
>>>>>>>>
>>>>>>>> The issue is that when we run the command line *./dspace 
>>>>>>>> filter-media*, the files that already have thumbnails are also read, 
>>>>>>>> although they are skipped. This means the process reads the files from 
>>>>>>>> the beginning each time, which takes more time as the number of files 
>>>>>>>> increases.
>>>>>>>>
>>>>>>>> Is there any other method, such as executing a script, for 
>>>>>>>> generating thumbnails more efficiently?
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> Sai Kumar S
>>>>>>>>
>>>>>>>> On Tuesday 11 June 2024 at 02:37:15 UTC+5:30 DSpace Community wrote:
>>>>>>>>
>>>>>>>>> Hi Sai,
>>>>>>>>>
>>>>>>>>> If you run "filter-media" **without** the "-f" flag, then it 
>>>>>>>>> should automatically skip all Items that already have generated 
>>>>>>>>> thumbnails.   For example:
>>>>>>>>>
>>>>>>>>> ./dspace filter-media
>>>>>>>>>
>>>>>>>>> When you run it **with** the "-f" flag, that tells the 
>>>>>>>>> filter-media script to **regenerate all thumbnails**.
>>>>>>>>>
>>>>>>>>> For more information see the documentation on this script 
>>>>>>>>> <https://wiki.lyrasis.org/display/DSDOC7x/Mediafilters+for+Transforming+DSpace+Content#MediafiltersforTransformingDSpaceContent-Executing(viaCommandLine)>.
>>>>>>>>>
>>>>>>>>> (The "skip list" is only needed if you have files which are 
>>>>>>>>> consistently throwing errors and you want to *skip them from all 
>>>>>>>>> future runs* of the "filter-media" script.  But, it shouldn't be 
>>>>>>>>> necessary in your use case.)
>>>>>>>>>
>>>>>>>>> Tim
>>>>>>>>>
>>>>>>>>> On Monday, June 10, 2024 at 5:09:33 AM UTC-5 
>>>>>>>>> [email protected] wrote:
>>>>>>>>>
>>>>>>>>>> Hi All,
>>>>>>>>>>
>>>>>>>>>> I have a query regarding filter-media. I have uploaded around 
>>>>>>>>>> 1000 books to a collection and generated thumbnails for the PDF 
>>>>>>>>>> files using the command line *dspace filter-media -f*.
>>>>>>>>>>
>>>>>>>>>> However, when I upload another 1000 files to the same collection, 
>>>>>>>>>> I need to generate thumbnails only for the newly uploaded files. I 
>>>>>>>>>> tried using the skip mode by creating a *skip-list.txt*, but I am 
>>>>>>>>>> not getting the desired result.
>>>>>>>>>>
>>>>>>>>>> Could any of you provide an example of how to correctly use the 
>>>>>>>>>> skip-list.txt method to generate thumbnails?
>>>>>>>>>>
>>>>>>>>>> Alternatively, is there any other method, such as using a script 
>>>>>>>>>> (e.g., Python), to generate the thumbnails for only the newly 
>>>>>>>>>> uploaded files?
>>>>>>>>>>
>>>>>>>>>> Please help me solve this query.
>>>>>>>>>>
>>>>>>>>>> Thanks & Regards
>>>>>>>>>> Sai Kumar S
>>>>>>>>>>
>>>>>>>>>> -- 
>>>>>>>> All messages to this mailing list should adhere to the Code of 
>>>>>>>> Conduct: https://www.lyrasis.org/about/Pages/Code-of-Conduct.aspx
>>>>>>>> --- 
>>>>>>>> You received this message because you are subscribed to the Google 
>>>>>>>> Groups "DSpace Community" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>>>> send an email to [email protected].
>>>>>>>> To view this discussion on the web visit 
>>>>>>>> https://groups.google.com/d/msgid/dspace-community/07d120cd-74de-4420-b49d-d3ee6744738an%40googlegroups.com
>>>>>>>>  
>>>>>>>> <https://groups.google.com/d/msgid/dspace-community/07d120cd-74de-4420-b49d-d3ee6744738an%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>> .
>>>>>>>>

