Dear Damian Józefowski,

Thank you for addressing the problem; your assistance is greatly appreciated.

I need to know why thumbnails are not being generated for larger files. What could be the cause, and how can it be resolved?
Thanks & Regards
Sai Kumar S

On Monday 17 June 2024 at 19:02:58 UTC+5:30 Damian Józefowski wrote:

> Hi,
>
> Here is the PR: https://github.com/DSpace/DSpace/pull/9653. I hope it will
> solve your problems.
>
> Best regards,
> Damian Józefowski
>
> On Mon, 17 Jun 2024 at 11:22, SAI KUMAR S <[email protected]> wrote:
>
>> Hi,
>>
>> Thank you for your email.
>>
>> Could you please share the information needed to reach the solution?
>>
>> Thanks & Regards
>> Sai Kumar S
>>
>> On Thu, 13 Jun 2024 at 18:47, SAI KUMAR S <[email protected]> wrote:
>>
>>> Hi Damian Józefowski,
>>>
>>> Thank you.
>>>
>>> Could you please share the information that would help us resolve the
>>> issue?
>>>
>>> Regards
>>> Sai Kumar S
>>>
>>> On Thursday 13 June 2024 at 18:32:38 UTC+5:30 Damian Józefowski wrote:
>>>
>>>> Hi,
>>>> In one of our projects at PCG we have added an option to this script
>>>> which allows you to set a date from which the thumbnails will be
>>>> generated.
>>>> The date is compared with the item's last modified value.
>>>> If you are interested, we can prepare a PR with this solution.
>>>>
>>>> Best regards
>>>> Damian Józefowski
>>>>
>>>> On Wed, 12 Jun 2024 at 17:24, DSpace Community <[email protected]> wrote:
>>>>
>>>>> Hi Sai & Daan,
>>>>>
>>>>> The filter-media script always loops through all objects to
>>>>> *determine* which ones need to be processed. This script is in charge of
>>>>> *not only* thumbnails, but also extracting text for indexing purposes
>>>>> (and any other actions that are enabled as "filter.plugins" in your
>>>>> dspace.cfg). See the full docs at
>>>>> https://wiki.lyrasis.org/display/DSDOC7x/Mediafilters+for+Transforming+DSpace+Content
>>>>>
>>>>> So, this script doesn't keep a list of objects which have already had
>>>>> thumbnails generated. The reason is that, even if a file has a generated
>>>>> thumbnail, it's possible the file needs to be processed by other filters
>>>>> (e.g.
>>>>> for full text indexing the textual content may be extracted). So,
>>>>> every time you run "filter-media" it will loop through every file...but
>>>>> will skip any files that it notices were already processed (e.g. if the
>>>>> file already has a thumbnail or extracted text, it will not regenerate
>>>>> it unless you use the "-f" flag to force regeneration).
>>>>>
>>>>> The "skip mode" (-s flag) concept can also be used to tell it to skip
>>>>> entire communities/collections/items...but then it will never process
>>>>> that object again until it is removed from the skip list. So, this
>>>>> should be used sparingly unless you are sure the object will never need
>>>>> a new thumbnail or full text indexing, etc.
>>>>>
>>>>> There are options to process files little by little (using the "-m" or
>>>>> maximum flag) or even process files community-by-community or
>>>>> collection-by-collection (using the "-i" or identifier flag) in order to
>>>>> break down a larger job into smaller chunks.
>>>>>
>>>>> This is simply how this tool works at this time. I do agree there may
>>>>> be ways to make it more efficient. But, we haven't had a developer
>>>>> volunteer to do such work or to redesign the current process. If you or
>>>>> anyone else out in the community are interested in helping to improve
>>>>> this tool, I'm sure the Committers would welcome ideas. All code in
>>>>> DSpace is built/supported by volunteers and users. We don't have a
>>>>> centralized development team (i.e. I have no developers working for me).
>>>>>
>>>>> Semi-related to this, there have been past discussions about
>>>>> migrating all media filter scripts/tools into curation tasks (which would
>>>>> allow these processes to be run one-by-one as each new submission is
>>>>> added to DSpace, instead of via the current bulk processing script).
>>>>> There are some older tickets/PRs related to that, but it has never been
>>>>> finished / found to be fully working.
>>>>> See https://github.com/DSpace/DSpace/issues/6398 and
>>>>> https://github.com/DSpace/DSpace/pull/1674 (That said, I'd love to
>>>>> see this work completed at some point.)
>>>>>
>>>>> Tim
>>>>>
>>>>> On Tuesday, June 11, 2024 at 8:58:49 AM UTC-5 [email protected] wrote:
>>>>>
>>>>>> Hi Daan,
>>>>>>
>>>>>> Thank you for your reply.
>>>>>>
>>>>>> As you said, if I have to restore an entire database and the
>>>>>> assetstore, it depends on whether the thumbnails had been generated
>>>>>> before the backup was taken. If they had, there should be no need to
>>>>>> regenerate them from scratch (I may be wrong; please correct me if any
>>>>>> of this is incorrect).
>>>>>>
>>>>>> What I wanted to know is why, when I start generating thumbnails,
>>>>>> the process starts from scratch (although the already generated
>>>>>> thumbnails are skipped anyway).
>>>>>>
>>>>>> I wondered whether there is another method in which already generated
>>>>>> thumbnails are not read at all and only the required ones (those
>>>>>> without thumbnails) are generated.
>>>>>>
>>>>>> Regards
>>>>>> Sai Kumar S
>>>>>>
>>>>>> On Tue, 11 Jun 2024, 11:06 am Daan Lessing, <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Good morning,
>>>>>>>
>>>>>>> Just a follow-up question on this. Let's say, for instance, you have
>>>>>>> to restore an entire database and the assetstore: do you lose all
>>>>>>> thumbnails, and will filter-media have to start building thumbnails
>>>>>>> from scratch?
>>>>>>>
>>>>>>> I have been running filter-media for 3 weeks and it has not yet
>>>>>>> completed.
>>>>>>>
>>>>>>> Looking forward to your response.
>>>>>>>
>>>>>>> Kind regards,
>>>>>>> Daan
>>>>>>>
>>>>>>> On Tue, Jun 11, 2024 at 6:28 AM SAI KUMAR S <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Tim
>>>>>>>>
>>>>>>>> Thank you for the information.
>>>>>>>>
>>>>>>>> The issue is that when we run the command *./dspace filter-media*,
>>>>>>>> the files that already have thumbnails are also read, but they are
>>>>>>>> skipped. This means the process reads the files from the beginning
>>>>>>>> each time, which takes more time as the number of files increases.
>>>>>>>>
>>>>>>>> Is there any other method, such as executing a script, for
>>>>>>>> generating thumbnails more efficiently?
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> Sai Kumar S
>>>>>>>>
>>>>>>>> On Tuesday 11 June 2024 at 02:37:15 UTC+5:30 DSpace Community wrote:
>>>>>>>>
>>>>>>>>> Hi Sai,
>>>>>>>>>
>>>>>>>>> If you run "filter-media" **without** the "-f" flag, then it
>>>>>>>>> should automatically skip all Items that already have generated
>>>>>>>>> thumbnails. For example:
>>>>>>>>>
>>>>>>>>> ./dspace filter-media
>>>>>>>>>
>>>>>>>>> When you run it **with** the "-f" flag, that tells the
>>>>>>>>> filter-media script to **regenerate all thumbnails**.
>>>>>>>>>
>>>>>>>>> For more information see the documentation on this script
>>>>>>>>> <https://wiki.lyrasis.org/display/DSDOC7x/Mediafilters+for+Transforming+DSpace+Content#MediafiltersforTransformingDSpaceContent-Executing(viaCommandLine)>.
>>>>>>>>>
>>>>>>>>> (The "skip list" is only needed if you have files which are
>>>>>>>>> consistently throwing errors and you want to *skip them from all
>>>>>>>>> future runs* of the "filter-media" script. But, it shouldn't be
>>>>>>>>> necessary in your use case.)
>>>>>>>>>
>>>>>>>>> Tim
>>>>>>>>>
>>>>>>>>> On Monday, June 10, 2024 at 5:09:33 AM UTC-5 [email protected] wrote:
>>>>>>>>>
>>>>>>>>>> Hi All,
>>>>>>>>>>
>>>>>>>>>> I have a query regarding filter-media. I have uploaded around
>>>>>>>>>> 1000 books to a collection and generated thumbnails for the PDF
>>>>>>>>>> files using the command line *dspace filter-media -f*.
>>>>>>>>>>
>>>>>>>>>> However, when I upload another 1000 files to the same collection,
>>>>>>>>>> I need to generate thumbnails only for the newly uploaded files. I
>>>>>>>>>> tried using the skip mode by creating a *skip-list.txt*, but I am
>>>>>>>>>> not getting the desired result.
>>>>>>>>>>
>>>>>>>>>> Could any of you provide an example of how to correctly use
>>>>>>>>>> the skip-list.txt method to generate thumbnails?
>>>>>>>>>>
>>>>>>>>>> Alternatively, is there any other method, such as using a script
>>>>>>>>>> (e.g., Python), to generate the thumbnails for only the newly
>>>>>>>>>> uploaded files?
>>>>>>>>>>
>>>>>>>>>> Please help me solve this query.
>>>>>>>>>>
>>>>>>>>>> Thanks & Regards
>>>>>>>>>> Sai Kumar S
>>>>>>>>>>
>>>>>>>> --
>>>>>>>> All messages to this mailing list should adhere to the Code of
>>>>>>>> Conduct: https://www.lyrasis.org/about/Pages/Code-of-Conduct.aspx
>>>>>>>> ---
>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>> Groups "DSpace Community" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>> send an email to [email protected].
>>>>>>>> To view this discussion on the web visit
>>>>>>>> https://groups.google.com/d/msgid/dspace-community/07d120cd-74de-4420-b49d-d3ee6744738an%40googlegroups.com.
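
For reference, the chunking options Tim describes in this thread (the "-i" identifier flag and the "-m" maximum flag) could be combined in a small wrapper. This is only a sketch under stated assumptions: the handles 123456789/10 and 123456789/11, the batch size of 500, and the DSPACE_BIN, MAX_ITEMS and DRY_RUN variable names are hypothetical and do not come from the thread; only the filter-media flags themselves are taken from the messages. It deliberately omits "-f", so files that already have thumbnails are skipped, and it defaults to a dry run that only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch: run filter-media one community/collection at a time, in capped batches.
# All defaults below are illustrative assumptions, not values from the thread.

DSPACE_BIN="${DSPACE_BIN:-./dspace}"   # path to the dspace launcher (assumed)
MAX_ITEMS="${MAX_ITEMS:-500}"          # value passed to -m (assumed batch size)
DRY_RUN="${DRY_RUN:-1}"                # 1 = only print the commands, 0 = execute

# For each handle given, print (dry run) or execute one filter-media call.
# No -f is passed, so already processed files are skipped rather than redone.
run_in_chunks() {
  local handle
  for handle in "$@"; do
    if [ "$DRY_RUN" = "1" ]; then
      echo "$DSPACE_BIN filter-media -i $handle -m $MAX_ITEMS"
    else
      "$DSPACE_BIN" filter-media -i "$handle" -m "$MAX_ITEMS"
    fi
  done
}

# Hypothetical handles; replace them with real ones from your repository.
run_in_chunks 123456789/10 123456789/11
```

Running the script with DRY_RUN=0 would execute the commands instead of printing them. Note that this only limits each run to a smaller part of the repository; within that part, filter-media still checks every file and skips the ones already processed, as explained in the thread.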
