Hi,
I know there are a lot of apps for doing image descriptions. I've been working on one I call the Image Description Toolkit (IDT) for close to a year now. I'm not trying to replace anything else, but rather to fill a need I had that I didn't find other products supporting. The main points of my app are support for Ollama, so you can use local AI models, and the ability to process large batches of images and save the descriptions.

I have the first version of the apps working on a Mac if anyone is interested in giving them a try. They require an Apple silicon Mac, and even then most Ollama models will be far from instant. On an M1 MacBook Air, I do get reasonable descriptions from a model called Moondream in about 6 seconds per image. If you have a Claude or OpenAI key, the app also lets you use many of those models and process batches of images from either a command line tool or a graphical app.

This is still under development and is my first Mac app. It is written in Python, and there is also a Windows version, which was my main focus until recently. You can get a copy of the toolkit at http://www.theideaplace.net/projects/IDT-4.0.0Beta1Bld050.dmg.

I recently wrote a blog post about the latest release. The Mac version wasn't quite ready at the time, but the functionality is the same: "Introducing IDT 4.0 Beta 1: An Enhanced Way to Describe Your Digital Images" <https://theideaplace.net/introducing-idt-4-0-beta-1-an-enhanced-way-to-describe-your-digital-images/>. You can learn more from my projects page, "Software Projects Powered By My Ideas in Partnership with AI Coding" <https://theideaplace.net/projects/>.

This is still a beta. I've tried my best to ensure quality, but I'm sure there are issues. I do link to an issue list on my projects page. If you do try this and use the Claude or OpenAI options with your own keys, absolutely try one or two images first with the models and prompts you choose.
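For anyone curious what describing a batch of images with a local Ollama model roughly involves, here is a minimal Python sketch. It is not code from the toolkit and may differ from how IDT works internally. It assumes Ollama is running on its default local port with the Moondream model already pulled; the function names, the prompt text, and the output file name are made up for illustration.

```python
import base64
import json
import urllib.request
from pathlib import Path

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def build_payload(image_bytes, model="moondream", prompt="Describe this image."):
    """Build the JSON body for one non-streaming description request."""
    return {
        "model": model,
        "prompt": prompt,
        # Ollama expects images as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def describe_image(path, model="moondream", prompt="Describe this image."):
    """Send one image to the local Ollama server and return its description."""
    body = json.dumps(build_payload(Path(path).read_bytes(), model, prompt))
    request = urllib.request.Request(
        OLLAMA_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read())["response"].strip()


def describe_folder(folder, out_file="descriptions.txt", limit=None):
    """Describe the images in a folder and save one line per image.

    Passing limit=2 gives a small trial run before committing to a full batch.
    """
    images = sorted(
        p for p in Path(folder).iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
    )
    with open(out_file, "w", encoding="utf-8") as out:
        for image in images[:limit]:
            out.write(f"{image.name}: {describe_image(image)}\n")
```

Calling describe_folder("Photos", limit=2) would describe only the first two images, which is a cheap way to check a model and prompt before running a whole folder.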
I've done my best to test the models, my predefined prompts and the way I'm handling images to maximize description quality and minimize cost and token use but I have found the different models can definitely behave in various ways here.

Kelly
