Hi,

Here is the PR: https://github.com/DSpace/DSpace/pull/9653
I hope it will solve your problems.

Best regards,
Damian Józefowski

On Mon, 17 Jun 2024 at 11:22, SAI KUMAR S <[email protected]> wrote:

> Hi,
>
> Thank you for your email.
>
> Could you please share the information needed to get the solution?
>
> Thanks & Regards
> Sai Kumar S
>
> On Thu, 13 Jun 2024 at 18:47, SAI KUMAR S <[email protected]>
> wrote:
>
>> Hi Damian Józefowski,
>>
>> Thank you.
>>
>> Can you please share the information that could help us get to a
>> solution to the issue?
>>
>> Regards
>> Sai Kumar S
>>
>> On Thursday 13 June 2024 at 18:32:38 UTC+5:30 Damian Józefowski wrote:
>>
>>> Hi,
>>> In one of our projects at PCG we have added an option to this script
>>> which allows you to set a date from which the thumbnails will be generated.
>>> The date is compared against each item's last-modified value.
>>> If you are interested, we can prepare a PR with this solution.
>>>
>>> Best regards
>>> Damian Józefowski
>>>
>>> On Wed, 12 Jun 2024 at 17:24, DSpace Community <
>>> [email protected]> wrote:
>>>
>>>> Hi Sai & Daan,
>>>>
>>>> The filter-media script always loops through all objects to *determine*
>>>> which ones need to be processed. This script is in charge of *not only*
>>>> thumbnails, but also extracting text for indexing purposes (and any
>>>> other actions that are enabled as "filter.plugins" in your dspace.cfg).
>>>> See the full docs at
>>>> https://wiki.lyrasis.org/display/DSDOC7x/Mediafilters+for+Transforming+DSpace+Content
>>>>
>>>> So, this script doesn't keep a list of objects which have already had
>>>> thumbnails generated. The reason is that, even if a file has a generated
>>>> thumbnail, it's possible the file needs to be processed by other filters
>>>> (e.g. for full text indexing the textual content may be extracted). So,
>>>> every time you run "filter-media" it will loop through every file...but
>>>> will skip any files that it notices were already processed (e.g.
>>>> if the file already has a thumbnail or extracted text, it will not
>>>> re-generate it unless you use the "-f" flag to force regeneration).
>>>>
>>>> The "skip mode" (-s flag) concept can also be used to tell it to skip
>>>> entire communities/collections/items...but then it will never process that
>>>> object again until it is removed from the skip list. So, this should be
>>>> used sparingly unless you are sure the object will never need a new
>>>> thumbnail or full text indexing, etc.
>>>>
>>>> There are options to process files little by little (using the "-m" or
>>>> maximum flag) or even process files community-by-community or
>>>> collection-by-collection (using the "-i" or identifier flag) in order to
>>>> break down a larger job into smaller chunks.
>>>>
>>>> This is simply how this tool works at this time. I do agree there may
>>>> be ways to make it more efficient. But, we haven't had a developer
>>>> volunteer to do such work or to redesign the current process. If you or
>>>> anyone else out in the community are interested in helping to improve this
>>>> tool, I'm sure the Committers would welcome ideas. All code in DSpace is
>>>> built/supported by volunteers and users. We don't have a centralized
>>>> development team (i.e. I have no developers working for me).
>>>>
>>>> Semi-related to this, there have been past discussions about
>>>> migrating all media filter scripts/tools into curation tasks (which would
>>>> allow these processes to be run one-by-one as each new submission is added
>>>> to DSpace, instead of via the current bulk processing script). There are
>>>> some older tickets/PRs related to that, but it has never been finished /
>>>> found to be fully working. See
>>>> https://github.com/DSpace/DSpace/issues/6398 and
>>>> https://github.com/DSpace/DSpace/pull/1674 (That said, I'd love to
>>>> see this work completed at some point.)
>>>>
>>>> Tim
>>>>
>>>> On Tuesday, June 11, 2024 at 8:58:49 AM UTC-5 [email protected]
>>>> wrote:
>>>>
>>>>> Hi Daan,
>>>>>
>>>>> Thank you for your reply.
>>>>>
>>>>> As you said, if I have to restore an entire database and the
>>>>> assetstore, it depends on whether the thumbnails were generated before
>>>>> the backup was taken; if they were, there is no need to regenerate
>>>>> them from scratch (I may not be correct; please correct me if any
>>>>> information I have given is wrong).
>>>>>
>>>>> What I wanted to know is why, when I start thumbnail generation, it
>>>>> starts from scratch (although the already generated thumbnails get
>>>>> skipped anyway).
>>>>>
>>>>> I wondered whether there is any other method where already generated
>>>>> thumbnails are not read and thumbnails are generated only where
>>>>> required (that is, for files which do not have thumbnails).
>>>>>
>>>>> Regards
>>>>> Sai Kumar S
>>>>>
>>>>> On Tue, 11 Jun 2024, 11:06 am Daan Lessing, <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Good morning,
>>>>>>
>>>>>> Just a follow-up question on this. Let's say for instance you have to
>>>>>> restore an entire database and the assetstore, do you lose all thumbnails
>>>>>> and will filter-media have to start building thumbnails from scratch?
>>>>>>
>>>>>> I have been running filter-media and it has been running for 3 weeks
>>>>>> and has not yet completed.
>>>>>>
>>>>>> Looking forward to your response.
>>>>>>
>>>>>> Kind regards,
>>>>>> Daan
>>>>>>
>>>>>> On Tue, Jun 11, 2024 at 6:28 AM SAI KUMAR S <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Tim
>>>>>>>
>>>>>>> Thank you for the information.
>>>>>>>
>>>>>>> The issue is that when we run the command line *./dspace
>>>>>>> filter-media*, the files that already have thumbnails are also read,
>>>>>>> but they are skipped. This means the process reads the files from the
>>>>>>> beginning each time, which takes more time as the number of files
>>>>>>> increases.
>>>>>>>
>>>>>>> Is there any other method, such as executing a script, for
>>>>>>> generating thumbnails more efficiently?
>>>>>>>
>>>>>>> Regards
>>>>>>> Sai Kumar S
>>>>>>>
>>>>>>> On Tuesday 11 June 2024 at 02:37:15 UTC+5:30 DSpace Community wrote:
>>>>>>>
>>>>>>>> Hi Sai,
>>>>>>>>
>>>>>>>> If you run "filter-media" **without** the "-f" flag, then it should
>>>>>>>> automatically skip all Items that already have generated thumbnails.
>>>>>>>> For example:
>>>>>>>>
>>>>>>>> ./dspace filter-media
>>>>>>>>
>>>>>>>> When you run it **with** the "-f" flag, that tells the filter-media
>>>>>>>> script to **regenerate all thumbnails**.
>>>>>>>>
>>>>>>>> For more information see the documentation on this script
>>>>>>>> <https://wiki.lyrasis.org/display/DSDOC7x/Mediafilters+for+Transforming+DSpace+Content#MediafiltersforTransformingDSpaceContent-Executing(viaCommandLine)>
>>>>>>>> .
>>>>>>>>
>>>>>>>> (The "skip list" is only needed if you have files which are
>>>>>>>> consistently throwing errors and you want to *skip them from all future
>>>>>>>> runs* of the "filter-media" script. But, it shouldn't be necessary in
>>>>>>>> your use case.)
>>>>>>>>
>>>>>>>> Tim
>>>>>>>>
>>>>>>>> On Monday, June 10, 2024 at 5:09:33 AM UTC-5 [email protected]
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi All,
>>>>>>>>>
>>>>>>>>> I have a query regarding filter-media.
>>>>>>>>> I have uploaded around 1000 books to a collection and generated
>>>>>>>>> thumbnails for the PDF files using the command line
>>>>>>>>> *dspace filter-media -f*.
>>>>>>>>>
>>>>>>>>> However, when I upload another 1000 files to the same collection,
>>>>>>>>> I need to generate thumbnails only for the newly uploaded files. I
>>>>>>>>> tried using the skip mode by creating a *skip-list.txt*, but I am not
>>>>>>>>> getting the desired result.
>>>>>>>>>
>>>>>>>>> Could any of you provide an example of how to correctly use the
>>>>>>>>> skip-list.txt method to generate thumbnails?
>>>>>>>>>
>>>>>>>>> Alternatively, is there any other method, such as using a script
>>>>>>>>> (e.g., Python), to generate the thumbnails for only the newly uploaded
>>>>>>>>> files?
>>>>>>>>>
>>>>>>>>> Please help me solve this query.
>>>>>>>>>
>>>>>>>>> Thanks & Regards
>>>>>>>>> Sai Kumar S
>>>>>>>>>
>>>>>>>>> --
>>>>>>> All messages to this mailing list should adhere to the Code of
>>>>>>> Conduct: https://www.lyrasis.org/about/Pages/Code-of-Conduct.aspx
>>>>>>> ---
>>>>>>> You received this message because you are subscribed to the Google
>>>>>>> Groups "DSpace Community" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>> send an email to [email protected].
>>>>>>> To view this discussion on the web visit
>>>>>>> https://groups.google.com/d/msgid/dspace-community/07d120cd-74de-4420-b49d-d3ee6744738an%40googlegroups.com
>>>>>>> <https://groups.google.com/d/msgid/dspace-community/07d120cd-74de-4420-b49d-d3ee6744738an%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>> .
>>>>>>>
>>>>>> --
>>>> All messages to this mailing list should adhere to the Code of Conduct:
>>>> https://www.lyrasis.org/about/Pages/Code-of-Conduct.aspx
>>>> ---
>>>> You received this message because you are subscribed to the Google
>>>> Groups "DSpace Community" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/dspace-community/6190fd47-32e8-4f7f-a1d5-3c9744dce5ean%40googlegroups.com
>>>> <https://groups.google.com/d/msgid/dspace-community/6190fd47-32e8-4f7f-a1d5-3c9744dce5ean%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>> .
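
For anyone who wants to script the chunked approach Tim describes in this thread (the "-i" identifier flag combined with the "-m" maximum flag), a minimal Python sketch might look like the following. It is only an illustration and not part of DSpace: the helper names, the /dspace/bin/dspace path and the example handles are hypothetical placeholders, and it assumes the identifiers passed to "-i" are community or collection handles. The only flags it uses are the ones mentioned in the thread (-f, -i, -m). By default it just prints the commands it would run.

```python
import subprocess
from typing import List, Optional


def build_filter_media_cmd(
    dspace_bin: str,
    identifier: Optional[str] = None,
    maximum: Optional[int] = None,
    force: bool = False,
) -> List[str]:
    """Build the argument list for a single filter-media run."""
    cmd = [dspace_bin, "filter-media"]
    if force:
        # -f forces regeneration, so leave it off to skip processed files
        cmd.append("-f")
    if identifier is not None:
        # -i restricts the run to one community/collection/item
        cmd += ["-i", identifier]
    if maximum is not None:
        # -m caps how much is processed in this run
        cmd += ["-m", str(maximum)]
    return cmd


def run_in_chunks(
    dspace_bin: str,
    identifiers: List[str],
    maximum: Optional[int] = None,
    dry_run: bool = True,
) -> List[List[str]]:
    """Run filter-media once per identifier; only print the commands if dry_run."""
    commands = [build_filter_media_cmd(dspace_bin, i, maximum) for i in identifiers]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the loop if one run fails
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical install path and collection handles, for illustration only.
    run_in_chunks(
        "/dspace/bin/dspace",
        ["123456789/10", "123456789/11"],
        maximum=500,
    )
```

Calling run_in_chunks(..., dry_run=False) would execute the commands one collection at a time. Note that this does not avoid the scan Tim describes: each run still loops over the objects under the given identifier and skips the ones that were already processed.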
