Barry---
You make a good point, I completely skipped ICON.
I used Snobol for years and then one day I
wondered if there had been any updates to the
language and ran into Icon and Unicon, but by
that time Unicon already existed.
Basically I use the Icon subset of Unicon if the
program is very large, but mostly I just use the
string parsing in a single straight program without any function calls.
I never saw the ICON book. I think the online PDF
books are great, normally I find what I need in one of them.
But it is true that if I had to buy a book I would probably have stuck with Snobol.
I kind of like Snobol because it reminds me of assembly language programming.
I tell people about Snobol/Icon/Unicon all the
time, because it is such a great tool, but no one is ever interested.
And I don't understand why, because if you need
to manipulate text it is the only way to go.
I have been working on a text related project
recently and tried to do some searches in VBA for Word.
That is really despicable. I just could not get
at what I needed, because you can grab only
characters, words or paragraphs, no LINES because
the lines are in a flow. And then you have to
manipulate either selections or ranges. And then
once you have the ranges then you only have very
primitive string manipulation functions to deal with them.
I asked around various people and no one I know
does any VBA programming in Word; everyone uses VBA with Excel.
When I mentioned Snobol/Icon/Unicon to the people
I was talking to about my problem, I got only
blank stares until I got to Snobol. At that point
everyone said they thought it was just a relic of
computing history, and they were shocked that
there were follow-on languages in that family.
I had to solve my problem outside of Word. I did
not in this case solve it with Unicon, but with a
concordance program. I have a wonderful
concordance program from
http://www.concordancesoftware.co.uk/ which did
the trick, because the text was so malformed that
it would have taken forever to write a program
that would recognize all the cases.
But I still had to do a lot of hand manipulation,
in order to create the files I needed from the concordance program output.
It seems to me no one has really solved the
problem of text manipulation on the fly.
I have to have a certain size problem before it
is worth writing a program in Unicon to solve the
problem. I am constantly using multiple tools to
process the data and massage it into a needed
format on all the text that is not worthy of
writing a Unicon program to process.
So it seems to me that I must not be the only one
doing this. I think there must be a huge hidden
market out there for something that manipulates
text which is more powerful than regular
expressions and search, but less of a problem
than programming in Unicon. I have seen several
suites of special purpose tools (for example
http://www.boxersoftware.com/textmonkey.htm ),
but I have never seen anything that solves this
problem between what is possible by using
spreadsheets together with wordprocessing
programs and what is possible using UNICON.
Perhaps if Unicon addressed this hidden market
somehow, it could find its niche. Instead of
program examples we need something like templates
by which programs could be altered to do slightly
different processing. The problem is that tools
are too specific, but to get variation you need
the complete generality of the programming
language. There should be something between these
extremes where there were adaptable tools that
were flexible and changeable but still not general.
This is the idea of Domain Specific Languages but
I don't think there is anything equivalent to
that in the text manipulation world.
I was hoping that XML would fill the bill but it
really only makes a bigger problem, because most
of the texts do not have tags, and placing the
tags in the text is just as big a problem as
writing a text recognition program.
For instance, in my latest effort I tried
producing the XML output of Word 2007 and then
unzipping it to get the core document. But that
XML had the text so split up that it was going to
be a bigger problem than just dealing with the text in bulk.
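For what it is worth, the split-up runs can be glued back together fairly mechanically. Here is a minimal sketch in Python rather than Unicon, assuming a standard Word 2007 .docx whose main part is word/document.xml, with the text held in w:t elements inside w:p paragraphs. The function name docx_paragraphs is made up for illustration, and this recovers paragraphs only, not the displayed lines, which Word does not store.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by Word 2007 documents.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

def docx_paragraphs(path):
    """Return one string per paragraph, joining the split-up text runs."""
    # A .docx file is a zip archive; the body text lives in word/document.xml.
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter("{%s}p" % W_NS):
        # Each w:t element holds one fragment of the paragraph's text.
        pieces = [t.text or "" for t in para.iter("{%s}t" % W_NS)]
        paragraphs.append("".join(pieces))
    return paragraphs
```

Something like docx_paragraphs("report.docx") (a hypothetical file name) would then give a list of plain strings to feed to a recognizer. Paragraphs nested inside text boxes would show up twice with this simple walk, so treat it as a starting point only.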
Once you get outside of Word by dumping a txt
file then many times the text is so screwed up that it is difficult to parse.
So if the TXT is screwed up and the things
searched are malformed, then building a text
recognizer is complicated. So that is where the
concordance program came in handy.
I wish that once I had the concordance program
output that I could then parse that. But it was
not worth the time to write that program even though
the concordance program produces txt and html versions of the concordance.
Anyway I know I am waffling a bit, but it just
seems that there is a middle ground of text
manipulation for which the tools are missing.
I heard today that 161 exabytes (an exabyte is 10^18 bytes) of data were produced this last year.
In 2003 it was 5 exabytes.
It seems that with all that data, there should be
a market for a good text manipulation language.
Especially one that supported some sort of middle
level tools for problems where writing a program
is not possible, but there is a lot of text processing to be done by hand.
Perhaps there is some level between text and xml
which we are missing out on, but where there is a
significant amount of work performed.
I know that many times I am using multiple tools
to get the final result I am seeking, where I am
massaging the document many times with different
tools and in multiple passes where each pass does
something a little different to it toward achieving the final result.
For instance, one trick I am sure you have all
used is to do search and replace or insert into a
text document to put in the hooks that I would
search for with my text recognizer. Sometimes
that is tricky with wildcards. An excellent tool
in this regard is SR
http://www.funduc.com/search_replace.htm for
search and replace across multiple files.
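The hook trick itself can be sketched in a few lines. This is only an illustration in Python, not Unicon: the marker string, the pattern (a line starting with a number and a period) and the function names are all assumptions made up for the example, and a real text would need its own pattern.

```python
import re

# Hypothetical marker; pick something that cannot occur in the text itself.
HOOK = "@@ENTRY@@"

# Assumed pattern for illustration: a line that starts with a number and a period.
ENTRY_START = re.compile(r"^(?=\d+\.\s)", re.MULTILINE)

def add_hooks(text):
    """First pass: insert HOOK in front of every line that looks like an entry start."""
    return ENTRY_START.sub(HOOK, text)

def split_on_hooks(text):
    """Second pass: cut the hooked text into entries, dropping anything before the first hook."""
    return [part.strip() for part in text.split(HOOK)[1:]]
```

The point of doing it in two passes is that the first pass can be checked by eye, and fixed by hand where the pattern misfires, before the second pass relies on the hooks.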
Kent Palmer
Date: Fri, 9 Mar 2007 05:21:39 -0600
Subject: Re: [Unicon-group] Ruby Python vs. Icon/Unicon
bryan rasmussen <[EMAIL PROTECTED]> wrote:
> I'm a newbie with Unicon, I basically decided to start with it
> because, well I like learning new languages especially ones with an
> easily perceived niche. I think the niche of Unicon as you say is text
> processing.
>
> I use XML a lot in my day to day, I'm not sure I understand the
> assertion that Unicon would be great for XML, since the main thing one
> needs for XML programming is easy tree manipulation, such as is
> provided with XSL-T.
There's an XML parser in the uni/xml directory of the CVS. I haven't
used it a lot, though it is likely I will be soon.
I agree with Kent Palmer that the syntax of Icon/Unicon isn't great,
but I'm not sure how much effect that has on popularity, considering
that languages with significant syntax difficulties, such as Perl and
C++, have been adopted anyway. Personally, my best hypothesis would be
that Icon hasn't become more popular mainly because you had to go buy
the book, 'The Icon Programming Language', and thus only the most
curious people ever tried Icon, for they had to go to some effort and
spend money doing it. Perl is an example of a language that became
popular in part because you could try it without buying a book, though
of course you might buy a book or two later. The copylefted Unicon
book as a PDF is a start towards a remedy, but you need more stuff
similar to the Perl manpage(s), the GNU info pages, on-line HTML
tutorials and documentation, and so forth.
--
Barry.SCHWARTZ ĉe chemoelectric punkto org http://chemoelectric.org
Free stuff / Senpagaj varoj: http://crudfactory.com
'Democracies don't war; democracies are peaceful countries.' - Bush
(http://www.whitehouse.gov/news/releases/2005/12/20051219-2.html)
-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
_______________________________________________
Unicon-group mailing list
Unicon-group@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/unicon-group