Responding to Griffith

Thank you very much for your email and suggestions. My immediate plan is to
work with viral genomes, which are much simpler and smaller, and I guess it
is possible to put those genomes in a MongoDB.
At the moment I am still in the dark about what to do and how to implement
some ideas using atomspace. Currently, I am doing some reading on
atomspace. I will keep you posted about my progress and seek help if I may.
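
A minimal sketch of the MongoDB idea, assuming the viral genomes arrive as
FASTA text. The function name fasta_to_documents and the field names (_id,
description, length, sequence) are hypothetical choices for illustration,
not anything prescribed by MongoDB or atomspace. It only builds plain Python
dicts; nothing here talks to a database.

```python
from typing import Dict, Iterable, Iterator, List


def _make_doc(header: str, chunks: List[str]) -> Dict[str, object]:
    """Build one document-shaped dict from a FASTA header and sequence lines."""
    seq = "".join(chunks)
    parts = header.split(maxsplit=1)
    return {
        # Assumption: the first token of the header is a unique accession.
        "_id": parts[0] if parts else "",
        "description": parts[1] if len(parts) > 1 else "",
        "length": len(seq),
        "sequence": seq,
    }


def fasta_to_documents(lines: Iterable[str]) -> Iterator[Dict[str, object]]:
    """Yield one dict per FASTA record (hypothetical helper, not a library API)."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield _make_doc(header, chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())  # normalise case of the bases
    if header is not None:
        yield _make_doc(header, chunks)


# Made-up records, purely for illustration.
example = [
    ">demo1 hypothetical virus segment",
    "ACGT",
    "ttga",
    ">demo2",
    "GGCC",
]
docs = list(fasta_to_documents(example))
for doc in docs:
    print(doc["_id"], doc["length"])
```

If pymongo is installed and a server is running, a list like docs could be
handed to collection.insert_many(docs). MongoDB caps a single document at
16 MB, which viral genomes should fit within comfortably.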

Responding to Linas

I am planning to read about atomspace and to run some of the examples
that came with the package. Python would be the easier choice for me.
While trying to compile atomspace with the Python bindings, I got
the following error:

[ 97%] Built target utilities_cython
make[2]: *** No rule to make target '../opencog/persist/api/cython/../../storage/storage_types.pyx', needed by 'opencog/persist/api/cython/storage.cpp'.  Stop.

Please let me know the potential solutions for this error.

Kind regards,

Abu


On Sat, 11 Jan 2025 at 05:26, Linas Vepstas <[email protected]> wrote:

> Replying to Abu.
>
> On Wed, Jan 8, 2025 at 12:34 PM Abu Naser <[email protected]> wrote:
> >
> > Good to hear from you.
> > I have done some googling about LLMs, and I have found that many people are
> > using LLMs for analysing genomic data.
>
> I'd be amazed if there weren't. Pharma is a $1.6 trillion-dollar
> business in the US alone.
> https://www.statista.com/topics/1764/global-pharmaceutical-industry/
> If some of that money *wasn't* going into LLMs, I would conclude that
> I had died and been reanimated in a crappy universe simulation.
>
> > (https://github.com/MAGICS-LAB/DNABERT_2?tab=readme-ov-file that can
> > easily be used via https://huggingface.co/docs/transformers/en/index)
> > Their approach is the usual one: first train a model, then use it to predict.
> > In our case, where do we get the knowledge to store in the atomspace?
>
> That's a great question. (If I understand you correctly) I assume you
> already know how to get, or have access to, oodles and poodles of genomic
> data. There are open, public databases of genomic data, in all shapes
> and sizes. No doubt there's even more that's proprietary, say, the
> 23andMe dataset.
>
> I think the issue is "how do I hook up an LLM to the AtomSpace?" and
> the short answer is "I don't know". Well, I do know, but I am unhappy
> with all the ways I know how. So I've recently and with some urgency
> started to think about "what is the *best* way to hook up LLMs to the
> atomspace?" and I don't have an answer to that, yet. Might take a
> while.
>
> > I can certainly do some reading on their work and figure out how they
> > do it.
>
> Yes, please!  If you can then explain it to me, in email, that would
> be excellent.  If you can't explain it, then some paper references...
>
> > Do you have the pattern matching tool set in github?
>
> Yes. https://github.com/opencog/learn
>
> Terminology: in comp-sci, "pattern matching" usually refers to a very
> simple kind of matching, called "regular expressions" (regex), with
> theory developed in the 1960s and a standard part of Unix by the 1980s;
> see e.g. "perl regex".
>
> Besides regex, many programming languages have a similar but different
> idea: Scheme has "hygienic macros", as do other functional languages.
> Python does not; JavaScript does not. I think some of the latest and
> weirdest C++ standards-track work is trying to go that way. C++ templates
> are kind-of pattern-matcher-like-ish, but they're simple, and 30-35
> years old, now.
>
> In Atomese, I made the mistake of calling its graph rewriting system
> "pattern matching". Bad mistake, because it makes people think of the
> above rather simple systems. In fact, Atomese has 2 or 3 or 4 distinct
> systems that, uhh, "process patterns".
>
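
A minimal sketch of the distinction drawn above, in plain Python rather than
Atomese. The first part is an ordinary regex over a linear string. The second
part is a toy matcher over nested tuples with "$"-prefixed variables, plus a
substitution step that rewrites each match into a new structure. The names
match, query, substitute and the facts are all hypothetical; this is NOT the
AtomSpace query engine or its API, only an illustration of matching on
structure instead of on characters.

```python
import re

# Regex: finds a linear run of characters (start codon ... first stop codon).
motif = re.compile(r"ATG(?:[ACGT]{3})*?(?:TAA|TAG|TGA)")
orf = motif.search("CCATGAAATAGGG").group()
print(orf)


def match(pattern, term, bindings=None):
    """Match a nested-tuple pattern against a term; return bindings or None."""
    bindings = dict(bindings or {})
    if isinstance(pattern, str) and pattern.startswith("$"):
        if pattern in bindings:
            # A repeated variable must bind to the same value everywhere.
            return bindings if bindings[pattern] == term else None
        bindings[pattern] = term
        return bindings
    if isinstance(pattern, tuple) and isinstance(term, tuple):
        if len(pattern) != len(term):
            return None
        for p, t in zip(pattern, term):
            bindings = match(p, t, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == term else None


def query(pattern, facts):
    """Return the variable bindings for every fact the pattern matches."""
    results = []
    for fact in facts:
        found = match(pattern, fact)
        if found is not None:
            results.append(found)
    return results


def substitute(template, bindings):
    """Fill variables in a template: the 'rewrite' half of a rewrite rule."""
    if isinstance(template, tuple):
        return tuple(substitute(t, bindings) for t in template)
    if isinstance(template, str):
        return bindings.get(template, template)
    return template


# Made-up facts, purely for illustration.
facts = [
    ("Expresses", "geneA", "proteinX"),
    ("Expresses", "geneB", "proteinY"),
    ("Interacts", "proteinX", "proteinY"),
]
rewritten = [
    substitute(("Product", "$p", "$g"), b)
    for b in query(("Expresses", "$g", "$p"), facts)
]
print(rewritten)
```

The regex can only see adjacent characters; the tuple matcher binds variables
inside a structure and reuses them to build a new one, which is the flavour
of thing "graph rewriting" refers to, at a much smaller scale.
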
> At the bottom end, it's the "query engine", which is a sophisticated
> and fast graph rewrite engine. Tutorials here:
> https://github.com/opencog/atomspace/tree/master/examples/pattern-matcher
> You might find these to be ... mind-bendingly complicated. A theory
> paper is here:
> https://github.com/opencog/atomspace/raw/master/opencog/sheaf/docs/ram-cpu.pdf
>
> At the mid-range, there's a rule system and a unifier. The unifier
> works. The rule system needs to be torched and rewritten.
>
> At the "high-end", there's https://github.com/opencog/learn . In many
> ways, it kind-of-ish resembles transformers. Except that it works with
> structures, rather than linear strings of data. And that kind-of
> changes everything. It gets kind-of-ish similar results, but since it's
> also kind-of-ish completely different (because instead of working with
> strings, it works with trees) it's ... well, it's a weird-ass
> half-finished prototype. I love/hate it because I know why it's great
> and why it's utterly mis-designed. It's a steep hill to climb.
>
> > I am a command line person. I would not mind even if it is a bit messy.
> > I am a biologist by training but professionally I don't do biology.
> > It would be fun for me to do some biology on the sideline of my
> > profession.
>
> Ah! Well, let's start small. Look at and plan what is doable and
> interesting and fun.
>
> > My shortcoming is that I am not a good coder.
>
> Heh. I'm a *very good coder*, and so when I say "this shit is
> difficult", trust me. This shit is difficult.
>
> (yes, that's an "appeal to authority", but .. hey.)
>
> --linas
>
> --
> You received this message because you are subscribed to the Google Groups
> "opencog" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion visit
> https://groups.google.com/d/msgid/opencog/CAHrUA37Be-ak%3DvBrc7%2B4QXB6zYWOfGCB1BuSkxb0VFfh6N%2BNKw%40mail.gmail.com
> .
>
