[Trisquel-users] Re : The join command is missing the IPv4 addresses in long mixed lists of strings

2019-07-05 Thread lcerf
I see that your suspicions may be well founded. The smaller output files (30  
to several hundred kB) are clean-looking, but the largest ones (1500 kB down  
to ~700 kB) have duplicated rows, nearly exclusively.


Those were not suspicions.  I was writing that you taught me something.  I
thought the shell would throw an error.  Since it does not, it is probably
valid syntax that does what the user expects.  However, I suspect it
may be a "bashism", i.e., syntax that not all shells accept.


join's output certainly has duplicates because the input files have  
duplicates (is that normal?).  Just add the option --unique (or simply -u) to  
the sort commands.


They'll have to have the duplicates removed during post-processing ... and be  
checked for errors.


Do not do that as a post-process.  As I have just written: just add the
option --unique (or simply -u) to the sort commands.  It actually makes them
run faster, and 'join' too (smaller inputs and output).
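
A minimal sketch of the effect, on made-up data (the file names and contents
below are hypothetical). Note that with a key given, -u keeps one line per
key, so lines that share a key but differ elsewhere are collapsed too:

```shell
#!/bin/sh
# Hypothetical sample data; names and contents are invented for illustration.
export LC_ALL=C                      # same collation for sort and join
work=$(mktemp -d)
printf '%s\n' 'b.example 2' 'a.example 1' 'b.example 2' > "$work/old.txt"
printf '%s\n' 'b.example x' 'c.example y' 'b.example x' > "$work/new.txt"

# Without -u, a key duplicated in both inputs is paired every which way.
sort -k 1b,1 "$work/old.txt" > "$work/old.dup"
sort -k 1b,1 "$work/new.txt" > "$work/new.dup"
with_dups=$(join "$work/old.dup" "$work/new.dup" | wc -l)

# With -u, sort keeps one line per key, so join emits each match once.
sort -u -k 1b,1 "$work/old.txt" > "$work/old.sorted"
sort -u -k 1b,1 "$work/new.txt" > "$work/new.sorted"
deduped=$(join "$work/old.sorted" "$work/new.sorted")

echo "joined lines without -u: $with_dups"
echo "$deduped"
```

Here the key b.example appears twice in each input, so the join without -u
yields 2 x 2 lines for it.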


Before I start that processing, I'll see if I can try out your script; the  
extra steps won't be any drag on the joining, as the longest times for any  
joins were still in the blink-of-an-eye category (0.044 sec. system time).


Using two named pipes may actually be faster than using two subshells (what
happens when you put commands between parentheses)... by a constant time you
should not care about.  Only optimize at the end, if necessary, after
ensuring the whole process is correct and after identifying the bottleneck
(usually one single command).


I've been pairing up the most recent data with all of the prior data, one  
pair at a time, and that's getting tedious.


Use a Shell loop (or two).  For instance, if what you call "prior data" and  
"most recent data" are files in two separate directories and you want all  
pairs, then you can pass these two directories as the two arguments of a  
script like this one:

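#!/bin/sh
# $1: directory of prior data; $2: directory of most recent data.
mkfifo old.sorted
for old in "$1"/*
do
    for new in "$2"/*
    do
        out=joined-$(basename "$old")-$(basename "$new")
        # Sort the old file into the named pipe, in the background...
        sort -uk 1b,1 "$old" > old.sorted &
        # ...while the new file is sorted and piped to join's standard input.
        sort -uk 1b,1 "$new" | join old.sorted - > "$out"
    done
done
rm old.sorted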

"info join" page says that one of the target fields (but not both !) can be  
read from standard input.


One of the two input files (not "target fields": there is no such thing),  
yes.  I did it above, to give you an example.


In these repetitive joins that I'm doing now, can one of the target fields be  
read from a file that lists the other target files ?


You can do such a thing in a shell script using 'while read line; do ...; done
< file'.  Wouldn't you prefer to organize the files in directories and specify
these directories, as I suggested above?
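
For illustration, a sketch of that 'while read' loop. All names here
(list.txt, new.txt, set01.txt, set02.txt) and their contents are hypothetical
stand-ins; list.txt holds one file name per line:

```shell
#!/bin/sh
# Sketch only: every file name and every line of data below is made up.
export LC_ALL=C
work=$(mktemp -d)
printf '%s\n' 'a.example 1' 'b.example 2' > "$work/set01.txt"
printf '%s\n' 'b.example 3' 'c.example 4' > "$work/set02.txt"
printf '%s\n' 'b.example x' 'c.example y' > "$work/new.txt"
printf '%s\n' set01.txt set02.txt > "$work/list.txt"

# Sort the most recent data once, then join it with every listed file.
sort -u -k 1b,1 "$work/new.txt" > "$work/new.sorted"
# IFS= and -r keep each file name exactly as written in the list.
while IFS= read -r name
do
    sort -u -k 1b,1 "$work/$name" | join - "$work/new.sorted" > "$work/joined-$name"
done < "$work/list.txt"

cat "$work/joined-set01.txt" "$work/joined-set02.txt"
```

Adding a data set then only means adding one line to the list file.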


Remark: do you join files whose join fields are the whole lines (there are no  
additional fields)?  In other words, are you searching for equal lines in two  
files?  If so, then you actually want to use 'comm -12' instead of 'join'.   
'comm' is a simpler command that compares the (sorted) lines of two files.
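
A small sketch of 'comm -12' on invented one-column data (a.txt and b.txt
are hypothetical names):

```shell
#!/bin/sh
# Hypothetical one-column inputs: whole lines are compared, not fields.
export LC_ALL=C
work=$(mktemp -d)
printf '%s\n' 10.0.0.2 host.example 10.0.0.1 > "$work/a.txt"
printf '%s\n' 10.0.0.1 other.example host.example > "$work/b.txt"

sort -u "$work/a.txt" > "$work/a.sorted"
sort -u "$work/b.txt" > "$work/b.sorted"

# -1 hides lines found only in the first file, -2 those found only in the
# second; what remains is the column of lines common to both.
common=$(comm -12 "$work/a.sorted" "$work/b.sorted")
echo "$common"
```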


Re: [Trisquel-users] Problems installing GNU Health on Trisquel 8

2019-07-05 Thread elr
Then I think PureOS, https://www.gnu.org/distros/free-distros.html#for-pc,  
has a big problem, https://software.pureos.net/package/bin/green/python3-pip 


Re: [Trisquel-users] Setting up a server

2019-07-05 Thread mason
I found this guide[1] helpful in setting up a basic server to serve
static web pages. Here is a basic summary of the first steps.

Install Apache.

$ sudo apt install apache2

Configure the firewall.

$ sudo ufw enable
$ sudo ufw allow 'Apache Full'

Determine your IP address.

$ hostname -I

In your browser, go to

http://[your ip address]

and you should see Trisquel's default Apache page. To modify this page,
edit "/var/www/html/index.html".

[1] https://www.digitalocean.com/community/tutorials/how-to-install-the-apache-web-server-on-ubuntu-16-04




Re: [Trisquel-users] Setting up a server

2019-07-05 Thread jason
To start off with a web server, install Apache: "sudo apt install apache2".
Ta-da.


[Trisquel-users] Setting up a server

2019-07-05 Thread jbahn

Hi.

I have long wanted to set up my own server, but every time I get stuck.
There are loads of sites apparently explaining and illustrating how to do
it, including https://trisquel.info/en/wiki/server


Yet I haven't found one that helps me - with my level of (lack of) knowledge  
- sufficiently to understand what is needed, how to install it and how to set  
it up.


I wish to be able to host a few sites (i.e. a number of domains) like simple
pages, blogs etc. I would like some sort of upload-via-email system (i.e. I
send an email to a specific address, which posts the content on my site).


I also wish to set up my own
* email server
* encrypted file storage (with multiple users, so I can host e.g. my cousin's
  backup files without being able to know their content)
* encrypted etherpad

LATER I would like to learn how to set up
* a vpn service
* a tor node
* a video/chat instance
and more

So I want to start with a server that can host simple pages and blogs and  
later be 'upgraded' to host various features like the ones mentioned above.


First question: Which computer should I use? Naturally I want to run free
software exclusively, i.e. Libreboot and Trisquel.  Will a T60 suffice?
Would an X200 or a T400 be much better, or should I use something
else? Are there minimum requirements or recommendations regarding CPU,
RAM, bandwidth etc.?


Secondly, I need to know which software to install. If it is not too  
difficult or cumbersome I would like to start by setting up a basic server  
which I can later 'expand' to the more advanced features (rather than having  
to install totally different server software later).


Thirdly I need to know how to set it all up to ensure the proper  
functionality and privacy/security.


One issue I really don't understand is how to make the server accessible from  
the internet.


I hope you guys will help me with advice, knowledge and links to good reads.
Since I do not have much spare time, this project will probably last some
time. On top of getting my own server, I hope to be able to make a good guide
based on my experiences and the synthesis of your help.




Re: [Trisquel-users] The join command is missing the IPv4 addresses in long mixed lists of strings

2019-07-05 Thread amenex

Magic Banana wrote:

> I did not know it was OK to redirect twice the standard input; to avoid
> touching the disk I would have created named pipes,
> as in this short (untested) script ...

After spending the intervening time making about 150 joins, combining
eighteen sets of four-month Webalizer data every which way, I see that your
suspicions may be well founded. The smaller output files (30 to several
hundred kB) are clean-looking, but the largest ones (1500 kB down to ~700 kB)
have duplicated rows, nearly exclusively. They'll have to have the duplicates
removed during post-processing ... and be checked for errors.

Before I start that processing, I'll see if I can try out your script; the
extra steps won't be any drag on the joining, as the longest times for any
joins were still in the blink-of-an-eye category (0.044 sec. system time).


I've been pairing up the most recent data with all of the prior data, one
pair at a time, and that's getting tedious. The "info join" page says that
one of the target fields (but not both !) can be read from standard input.
In these repetitive joins that I'm doing now, can one of the target fields
be read from a file that lists the other target files ? I've got fifteen
more sets of data, so this file list can grow ... up to thirty-two now, but
almost without end if one looks at the number of Webalizer data sets that
are available.



[Trisquel-users] Re : The join command is missing the IPv4 addresses in long mixed lists of strings

2019-07-05 Thread lcerf

About the end of your post:

You cannot both read and write in the same file; your "two-step solution" is
OK.
I did not know it was OK to redirect twice the standard input; to avoid  
touching the disk I would have created named pipes, as in this short  
(untested) script:

mkfifo file1.sorted file2.sorted
sort -k 1b,1 -o file1.sorted file1 &
sort -k 1b,1 -o file2.sorted file2 &
join file1.sorted file2.sorted > Joined-file0102.txt
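
A self-contained variant of the same idea, as a sketch only: the sample data
is invented, the pipes live in a temporary directory, and plain redirection
into the pipes stands in for the -o option used above.

```shell
#!/bin/sh
# Sketch under assumptions: file1/file2 contents are made up for the demo.
export LC_ALL=C
work=$(mktemp -d)
printf '%s\n' 'b 2' 'a 1' > "$work/file1"
printf '%s\n' 'c 9' 'b 8' > "$work/file2"

# Named pipes: the sorted data goes straight to join, never to a disk file.
mkfifo "$work/file1.sorted" "$work/file2.sorted"
sort -k 1b,1 "$work/file1" > "$work/file1.sorted" &
sort -k 1b,1 "$work/file2" > "$work/file2.sorted" &
joined=$(join "$work/file1.sorted" "$work/file2.sorted")

wait                                  # let both background sorts finish
rm "$work/file1.sorted" "$work/file2.sorted"
echo "$joined"
```

Each background sort blocks until join opens the corresponding pipe for
reading, so the order in which the three commands start does not matter.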



Re: [Trisquel-users] Problems installing GNU Health on Trisquel 8

2019-07-05 Thread mason
> And is there someone who would be able to build such a package? Or should
> we wait till the GNU Health project does this?

I have seen some projects package their own software and distribute it
as a standalone .deb package, PPA, or apt repository. If the GNU Health
team wants to do this, it would then be easy to add GNU Health to
Trisquel's repositories. If not, we could make a request for packaging
in Debian. If a Debian maintainer takes on the task, GNU Health could be
made available in Debian and all downstream distributions, including
Ubuntu and Trisquel. If that doesn't work, a third party could package
it in a PPA. I've learned how to create very simple Debian packages, but
glancing at the GNU Health source code it looks outside my skill level.

> And one more question chaosmonk, as I have seen in one of your links,
> should something like this work in order to try to install GNU Health on
> Trisquel 8?:
>
> $ gnuhealth-3.4.0.tar.gz

Did you mean to say

> $ tar xf gnuhealth-3.4.0.tar.gz

?

> $ cd gnuhealth-3.4.0/
> $ sudo apt install python-setuptools python3-setuptools

Installing setuptools is necessary if this is the first time you've
compiled a program that uses setuptools. In the future you can skip this
step.

> $ python setup.py build
> $ sudo python setup.py install

Those notes aside, this should work *if* you have all of the necessary
dependencies installed. If you are missing some dependencies, you'll
have to install those first. If a missing dependency is in Trisquel's
repository, you can just install it with apt. If not, you'll need to
build it too. Why don't you give it a try, and if "python setup.py build"
fails, copy/paste the error message here.




Re: [Trisquel-users] The join command is missing the IPv4 addresses in long mixed lists of strings

2019-07-05 Thread amenex

In the process of proving myself wrong, I did the following experiment:

1. Make a shuffled version of the file GBsmt-front.txt.
2. Make its number of lines divisible by four (by temporarily deleting one
   line).
3. Divide GBsmt-front-shuf into four parts (A,B,C,D); replace the deleted
   line into part D.
4. Sort each of the four parts with the console (previously I had been
   believing the sorted output of LibreOffice Calc).
5. Sort GBsmt-front.txt (again, just to be sure, with the console).
6. Run the join command four times, with the A, B, C, and D portions of
   GBsmt-front-shuf, against GBsmt-front.txt.
7. Interim reality check: The sum in kB of the four output files equals the
   size of the original GBsmt-front.txt file.
8. For-sure reality check: Concatenate the four outputs of the above join
   command, sort, and compare to the original GBsmt-front.txt list.


After all this manipulation, the two files (GBsmt-front.txt and
GBsmt-front-shuf-(A,B,C,D)-join-concatenate-sort.txt) are identical.
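
The experiment above can be sketched as a short script. This is an
approximation under assumptions, not the exact commands that were run: the
list is a small made-up stand-in for GBsmt-front.txt with one field per line
and no duplicates, and GNU "split -n l/4" replaces the manual division into
four equal parts.

```shell
#!/bin/sh
# Hypothetical stand-in data: one field per line, no duplicate lines.
export LC_ALL=C
work=$(mktemp -d)
printf '%s\n' 10.0.0.7 host3.example 192.0.2.14 host1.example \
    10.0.0.12 host2.example 198.51.100.5 > "$work/list.txt"

shuf "$work/list.txt" > "$work/shuf.txt"
# GNU split: l/4 makes four chunks without cutting lines (part.aa .. part.ad).
split -n l/4 "$work/shuf.txt" "$work/part."
sort "$work/list.txt" > "$work/list.sorted"

# Sort each part right before joining it against the sorted full list,
# then concatenate and sort the four outputs.
for part in "$work"/part.a?
do
    sort "$part" | join - "$work/list.sorted"
done | sort > "$work/rejoined.sorted"

if cmp -s "$work/list.sorted" "$work/rejoined.sorted"
then verdict=identical
else verdict=different
fi
echo "$verdict"
```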


Then I tried the original task, and now the IPv4 addresses appear in the
joined output. As long as I sort each of the files to be joined right before
the join operation, the command doesn't complain ... and the IPv4 data appear
in droves.


Thanks to Magic Banana for confirming that join doesn't have undisclosed  
limitations.


Another tidbit: Sorting a file with "sort [file]" alone sends the sorted
output to the console; "sort [file] > itself" gives 0 bytes output.
My two-step "solution": "sort [file] > [file-newname]" then "mv
[file-newname] [file]" preserved the original file and its name and left
no residue. The correct way to do this in line with the join command is "join
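
For the sorting tidbit alone, a one-step alternative to the two-step approach
is sort's -o option, which may name one of the input files as the output. A
minimal sketch on a made-up file (fruit.txt is a hypothetical name):

```shell
#!/bin/sh
# Hypothetical file, created only for the demonstration.
export LC_ALL=C
work=$(mktemp -d)
printf '%s\n' pear apple fig > "$work/fruit.txt"

# Unlike "sort file > file", where the shell empties the file before sort
# reads it, -o is specified to allow the output to be one of the inputs.
sort -o "$work/fruit.txt" "$work/fruit.txt"

first=$(head -n 1 "$work/fruit.txt")
echo "$first"
```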



Re: [Trisquel-users] Problems installing GNU Health on Trisquel 8

2019-07-05 Thread ricardo63
And is there someone who would be able to build such a package? Or should we
wait till the GNU Health project does this?
And one more question chaosmonk, as I have seen in one of your links, should
something like this work in order to try to install GNU Health on Trisquel
8?:


$ gnuhealth-3.4.0.tar.gz
$ cd gnuhealth-3.4.0/
$ sudo apt install python-setuptools python3-setuptools
$ python setup.py build
$ sudo python setup.py install

And if so, should I run these commands all together? Sorry for the newbie
question :-(


Thanks and best regards,
Ricardo


Re: [Trisquel-users] Italian administration goes Open Source, and that will benefit Free Software

2019-07-05 Thread Ignacio Agulló
On 05/07/19 09:42, Narcis Garcia wrote:
> You've described a common environment with Spain, France, Greece, etc.
> including same proverb and including Windows XP.

Right.  Out of curiosity, I looked for the origin of the proverb...
and of course, it comes from Latin: inventa lege, inventa fraude.

-- 
Ignacio Agulló · agu...@ati.es





Re: [Trisquel-users] Italian administration goes Open Source, and that will benefit Free Software

2019-07-05 Thread Narcis Garcia
You've described a common environment with Spain, France, Greece, etc.
including same proverb and including Windows XP.

I suppose other people can identify this in more countries too.


On 4/7/19 at 11:25, vanacksabbad...@gmail.com wrote:
> I'm Italian, so this is amazing news for me as an Italian citizen.
> However, I know Italian people. We say "fatta la legge, trovato
> l'inganno", which literally means "created the law, found the trick".
> When the article says that "public administration MIGHT use proprietary
> software but it has to justify it", I read it as "everyone will find a
> way to use proprietary software by finding excuses not to use free and
> open source software". Why? Because that's how Italian people usually do
> things. And that is sad and shameful.
> 
> I hope someone will care about open source enough to make a statement
> and a change in the Italian administration, which is very obsolete (some
> offices still use Windows XP...) as of today.