from March via X (my X feed content is restabilizing after the change)

https://arxiv.org/abs/2403.13187

https://github.com/SakanaAI/evolutionary-model-merge

The Open LLM Leaderboard is now dominated by merged models, showcasing model
merging's potential for democratizing foundation model development.
https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard


   1. Automated Model Composition: We introduce Evolutionary Model Merge, a
      general evolutionary method to automatically discover optimal
      combinations of diverse open-source models for creating new foundation
      models with user-specified capabilities. This approach harnesses the
      collective intelligence of existing open models, enabling the creation
      of powerful models without the need for extensive training data or
      compute.

   2. Cross-Domain Merging: We demonstrate that our method can discover novel
      ways to merge models from disparate domains (e.g., non-English language
      and Math, non-English language and Vision), potentially exceeding the
      capabilities achievable through conventional human design strategies.

   3. State-of-the-Art Performance: We showcase the effectiveness of our
      method by automatically generating a Japanese LLM with Math reasoning
      capability and a Japanese Vision-Language Model (VLM). Notably, both
      models achieve state-of-the-art performance on various benchmarks, even
      without explicit optimization for those tasks.

   4. High Efficiency and Surprising Generalizability: We observe that our 7B
      parameter LLM surpasses the performance of some previous 70B parameter
      Japanese LLMs on benchmark datasets, highlighting the high efficiency
      and surprising generalization capability of our approach. We believe
      this model can serve as a strong general-purpose Japanese LLM.
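
To make the "evolutionary search over merge combinations" idea concrete, here
is a toy sketch, not the authors' implementation. It assumes each "model" is
just a flat list of floats, that merging is plain weighted averaging, and that
fitness is distance to a made-up target vector. The search is a simple
mutate-and-keep-the-best loop that I swapped in for illustration; the names
merge, fitness, and evolve are all hypothetical. See the paper and repo above
for what the authors actually do.

```python
import random


def merge(models, weights):
    """Linearly interpolate parameter vectors using normalized weights."""
    total = sum(weights)
    return [
        sum(w * m[i] for w, m in zip(weights, models)) / total
        for i in range(len(models[0]))
    ]


def fitness(params, target):
    """Toy score (assumption): negative squared error against a target vector."""
    return -sum((p - t) ** 2 for p, t in zip(params, target))


def evolve(models, target, generations=200, population=20, sigma=0.1, seed=0):
    """Search for merge weights by Gaussian mutation, keeping only improvements."""
    rng = random.Random(seed)
    best = [1.0 / len(models)] * len(models)  # start from a uniform merge
    best_score = fitness(merge(models, best), target)
    for _ in range(generations):
        for _ in range(population):
            # Mutate each weight; clamp above zero so normalization stays defined.
            child = [max(1e-6, w + rng.gauss(0.0, sigma)) for w in best]
            score = fitness(merge(models, child), target)
            if score > best_score:
                best, best_score = child, score
    total = sum(best)
    return [w / total for w in best], best_score


if __name__ == "__main__":
    # Two toy "models" and a target that lies between them.
    toy_models = [[1.0, 0.0], [0.0, 1.0]]
    weights, score = evolve(toy_models, target=[0.25, 0.75])
    print("merge weights:", weights, "score:", score)
```

In a real setting the fitness call would be a benchmark evaluation of the
merged model, which is where nearly all of the cost goes; the search loop
itself is cheap.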