| You’re most welcome Abu.
Quick question: What sources are you using to read up on AtomSpace? Just curious :)
Keep us all posted on how it’s going.
Cheers 🍻
-Griffith
Responding to Griffith
Thank you very much for your email and suggestions. My immediate plan is to work with viral genomes, which are much simpler and smaller, and I guess it is possible to put those genomes in MongoDB. At the moment I am still in the dark about what to do and how to implement some ideas using AtomSpace. Currently, I am doing some reading on AtomSpace. I will keep you posted about my progress and seek help if I may.
Responding to Linas
I am planning to read about AtomSpace and to run some of the examples that came with the package. Python would be an easier choice for me. While I was trying to compile AtomSpace with Python bindings, I got the following error:
[ 97%] Built target utilities_cython
make[2]: *** No rule to make target '../opencog/persist/api/cython/../../storage/storage_types.pyx', needed by 'opencog/persist/api/cython/storage.cpp'. Stop.
Please let me know the potential solutions for this error.
Kind regards,
Abu
Replying to Abu.
On Wed, Jan 8, 2025 at 12:34 PM Abu Naser <[email protected]> wrote:
>
> Good to hear from you.
> I have done some googling about LLMs, and I have found that many people are using LLMs for analysing genomic data.
I'd be amazed if there weren't. Pharma is a $1.6 trillion
business worldwide.
https://www.statista.com/topics/1764/global-pharmaceutical-industry/
If some of that money *wasn't* going into LLMs, I would conclude that
I had died and been reanimated in a crappy universe simulation.
> (https://github.com/MAGICS-LAB/DNABERT_2?tab=readme-ov-file that can easily be used via https://huggingface.co/docs/transformers/en/index)
> Their approach is the usual one: first train a model, then use it to predict. In our case, where do we get the knowledge to store in the AtomSpace?
That's a great question. (If I understand you correctly) I assume you
already know how to get, or already have access to, oodles and poodles
of genomic data. There are open, public databases of genomic data, in
all shapes and sizes. No doubt there's even more that's proprietary,
say, the 23andMe dataset.
I think the issue is "how do I hook up an LLM to the AtomSpace?" and
the short answer is "I don't know". Well, I do know, but I am unhappy
with all the ways I know how. So I've recently and with some urgency
started to think about "what is the *best* way to hook up LLMs to the
AtomSpace?" and I don't have an answer to that, yet. Might take a
while.
> I can certainly do some reading on their work and figure out how they do it.
Yes, please! If you can then explain it to me, in email, that would
be excellent. If you can't explain it, then some paper references...
> Do you have the pattern matching tool set on GitHub?
Yes. https://github.com/opencog/learn
Terminology: in comp-sci, "pattern matching" usually refers to a very
simple kind of matching, called "regular expressions" (regex), with
theory developed in the 1960s and a standard part of Unix by the 1980s;
see e.g. "perl regex".
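A minimal sketch of that regex sense of "pattern matching", in Python
since that is the language Abu said he prefers. The toy sequence and the
helper name find_orfs are invented for illustration; nothing here comes
from the AtomSpace.

```python
import re

# Hypothetical toy sequence, made up for illustration.
SEQ = "CCATGAAATTTTAGGGATGCCCTGA"

# A start codon (ATG), then as few whole codons as possible,
# then the first in-frame stop codon (TAA, TAG or TGA).
ORF = re.compile(r"ATG(?:[ACGT]{3})*?(?:TAA|TAG|TGA)")

def find_orfs(seq):
    """Return (start_index, matched_text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in ORF.finditer(seq)]

print(find_orfs(SEQ))
```

The pattern only ever scans a flat string from left to right, which is
the limitation being contrasted with graph rewriting.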
Besides regex, many programming languages have a similar but different
idea: Scheme has "hygienic macros", as do other functional languages.
Python does not; JavaScript does not. I think some of the latest and
weirdest C++ standards work is trying to go that way. C++ templates
are kind-of pattern-matcher-like-ish, but they're simple, and 30-35
years old, now.
In Atomese, I made the mistake of calling its graph rewriting system
"pattern matching". Bad mistake, because it makes people think of the
above rather simple systems. In fact, Atomese has 2 or 3 or 4 distinct
systems that, uhh, "process patterns".
At the bottom end, it's the "query engine", which is a sophisticated
and fast graph rewrite engine. Tutorials here:
https://github.com/opencog/atomspace/tree/master/examples/pattern-matcher
You might find these to be ... mind-bendingly complicated. A theory
paper is here: https://github.com/opencog/atomspace/raw/master/opencog/sheaf/docs/ram-cpu.pdf
At the mid-range, there's a rule system and a unifier. The unifier
works. The rule system needs to be torched and rewritten.
At the "high-end", there's https://github.com/opencog/learn In many
ways, it kind-of-ish resembles transformers. Except that it works with
structures, rather than linear strings of data. And that kind-of
changes everything. It gets kind-of-ish similar results, but since it's
also kind-of-ish completely different (because instead of working with
strings, it works with trees) it's ... well, it's a weird-ass
half-finished prototype. I love/hate it because I know why it's great
and why it's utterly mis-designed. It's a steep hill to climb.
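To give a rough feel for the "graph rewriting" sense of pattern matching
(the query-engine level, not the learn project), here is a toy sketch in
plain Python. It is an assumption-laden illustration only: it does not
use the AtomSpace API and says nothing about how the real query engine
is implemented. The facts, the "$name" variable convention and the
function names are all made up.

```python
# Toy illustration only: plain Python, NOT the AtomSpace API or its query engine.

def is_var(term):
    """Variables are strings starting with '$' (a convention invented here)."""
    return isinstance(term, str) and term.startswith("$")

def unify(clause, fact, bindings):
    """Match one pattern clause against one fact; return extended bindings or None."""
    if len(clause) != len(fact):
        return None
    new = dict(bindings)
    for p, f in zip(clause, fact):
        if is_var(p):
            # Bind the variable, or check it agrees with an earlier binding.
            if new.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return new

def query(graph, clauses, bindings=None):
    """Yield every variable binding that satisfies all clauses at once."""
    bindings = {} if bindings is None else bindings
    if not clauses:
        yield bindings
        return
    first, rest = clauses[0], clauses[1:]
    for fact in graph:
        extended = unify(first, fact, bindings)
        if extended is not None:
            yield from query(graph, rest, extended)

def rewrite(graph, clauses, template):
    """Return the new facts built by filling the template with each binding."""
    return {tuple(b.get(t, t) for t in template) for b in query(graph, clauses)}

# Hypothetical facts, made up for illustration.
GRAPH = {
    ("encodes", "geneA", "proteinX"),
    ("encodes", "geneB", "proteinY"),
    ("binds", "proteinX", "receptorR"),
}

# Two clauses that share the variable $protein, so they must match jointly.
PATTERN = [("encodes", "$gene", "$protein"), ("binds", "$protein", "$target")]
TEMPLATE = ("affects", "$gene", "$target")

print(rewrite(GRAPH, PATTERN, TEMPLATE))
```

The difference from a regex is that the two clauses are matched against
a graph of facts and tied together by a shared variable, and each match
is then used to build a new fact.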
> I am a command line person. I would not mind even if it is a bit messy. I am a biologist by training but
> professionally I don't do biology. It would be fun for me to do some biology on the sideline of my profession.
Ah! Well, let's start small. Look at and plan what is doable and
interesting and fun.
> My shortcoming is that I am not a good coder.
Heh. I'm a *very good coder*, and so when I say "this shit is
difficult", trust me. This shit is difficult.
(yes, that's an "appeal to authority", but .. hey.)
--linas
--
You received this message because you are subscribed to the Google Groups "opencog" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion visit https://groups.google.com/d/msgid/opencog/CAHrUA37Be-ak%3DvBrc7%2B4QXB6zYWOfGCB1BuSkxb0VFfh6N%2BNKw%40mail.gmail.com.
|