My first module

2015-07-14 Thread Rolf Holte
I've made some scripts to harvest (web scrape) metadata from Digitalarkivet
(DA). Since the task is formidable, I've split it into stages and use
several scripts for each stage. Common code is put into 2 scripts for
reuse, to keep the other scripts cleaner and more readable. These 2 scripts
are always included in my scripts, and are candidates for a module. I'm
thinking of making them into 1 or 2 modules. The concept works for the first
2 stages (I just need to write more code for the rest).

I mainly have five questions and would appreciate advice on them:

1) One or two modules?

Is it a good idea to split database operations into a separate module? I
use DBI and try to avoid non-standard SQL; most of the SQL is basic SELECT or
INSERT/LOAD DATA (more advanced SQL is placed in stored procedures, which I
call when more intricate tasks are needed). So far I've got 4 subs in
DA.pl and 20 subs in DA-DBI.pl. There will be more subs when I write the
code for the next 2 stages. I always need both scripts for my use. I can't
really run without a database in the back end (although I often opt to
store to a file, mainly for temporary storage, speed, or
debugging/informational purposes).

Or

should I just put everything into 1 module, since the config file can change
the database (from MySQL to anything else supported by DBI; some minor
things are MySQL-dependent, and could instead be moved to stored procedures)?
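
To make the two-module option concrete, here is a minimal sketch, not the
actual code in DA.pl or DA-DBI.pl. It uses the DIS::DA and DIS::DA::DBI names
discussed under question 3; the sub names and the config keys (driver,
database, host, port, user, password) are assumptions made up for
illustration. Both packages are shown in one file only so the sketch runs
as is.

```perl
use strict;
use warnings;

# Hypothetical database module; in a real distribution this would live in
# lib/DIS/DA/DBI.pm.
package DIS::DA::DBI;

# Build a DBI data source name from a config hash, so that switching back
# ends is a config change rather than a code change. DSN attribute names
# vary by driver; the ones used here match DBD::mysql.
sub dsn_from_config {
    my ($config) = @_;
    my $driver = $config->{driver} or die "config needs a 'driver'\n";
    my @parts  = map  { "$_=$config->{$_}" }
                 grep { defined $config->{$_} } qw(database host port);
    return "dbi:$driver:" . join(';', @parts);
}

# Connect lazily: DBI is only loaded when a handle is actually requested.
sub connect_from_config {
    my ($config) = @_;
    require DBI;
    return DBI->connect(
        dsn_from_config($config),
        $config->{user}, $config->{password},
        { RaiseError => 1, AutoCommit => 1 },
    );
}

# Hypothetical scraping-side module; it delegates storage concerns to
# DIS::DA::DBI. In a real distribution this would live in lib/DIS/DA.pm.
package DIS::DA;

sub new {
    my ($class, %args) = @_;
    return bless { config => $args{config} }, $class;
}

sub dsn {
    my ($self) = @_;
    return DIS::DA::DBI::dsn_from_config($self->{config});
}

package main;

my $da = DIS::DA->new(
    config => { driver => 'mysql', database => 'da', host => 'localhost' },
);
print $da->dsn, "\n";    # prints "dbi:mysql:database=da;host=localhost"
```

With this layout the scraping scripts only load DIS::DA, while the database
code can be tested or replaced on its own. Merging both into one module
would simply mean moving the two DBI subs into DIS::DA.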

2) Should it be a module at all?

Since I heavily depend on a database back end, should it be a module of its
own? I need to reuse code for many tasks (different scripts) in order to
web scrape metadata from the site. Is it more of an App?


3) Namespace

I'm not quite sure if I'm going to release all the code to scrape the site.
I've put code in several scripts which may or may not be included alongside
my module(s). There are 2 main reasons. First, it took me 4 days to scrape
the site the first time, and I don't want everyone to scrape the whole site
just for fun. Second, I'm not completely confident that everyone would
respect my licence. I'm happy to share on a non-commercial basis, but would
like something in return if the code is used commercially. If it's released
as an app (working code for everyone), then the App:: namespace should be
used, if I understood pause_namingmodules correctly. Otherwise, depending on
whether it's one or two modules, I've been thinking of DIS::DA and
DIS::DA::DBI (DIS is the acronym for the genealogy society I'm a member of
and am writing code for; DA is a known acronym for Digitalarkivet, the
Digital Archive of Norway). If it's one module, DigitalArkivet.pm might be
the best choice?


4) Best practice for POD?

As a newbie to POD, I've put the POD in between the code, reducing the
need for (extra) header comments on subs. The POD documents each sub and
acts as its header. Most code I've seen puts all the POD at the end of the
file. (Both can be done, but is the latter highly recommended / best
practice?) I find it easier to write POD when I can see what is going on,
and it also forces me to write the POD at once. I could move everything to
the end of the file before release, but then I feel I'd have to (re)write
header comments for each sub.

5) To CPAN or not to - Licence

My first thought is to license it something like this:

DA-DBI.pl by Rolf B. Holte is licensed under a Creative Commons
Attribution-NonCommercial-ShareAlike 4.0 International License. Permissions
beyond the scope of this license may be available at
http://dev.perl.org/licenses/artistic.html.

Why? I'd like to share the code, but not for commercial use.

Would that be OK, or do I have to use the Perl/Artistic license to put it on
CPAN? Can I prohibit commercial use?
-- 
rbh


Re: My first module

2015-07-14 Thread Shlomi Fish
Hi Rolf,

I've de-CCed modu...@perl.org to avoid bothering them.

On Mon, 13 Jul 2015 23:20:57 +0200
Rolf Holte rolf.ho...@gmail.com wrote:

 I've made some scripts to harvest (web scrape) metadata from Digitalarkivet
 (DA). Since the task is formidable, I've split it into stages and use
 several scripts for each stage. Common code is put into 2 scripts for
 reuse, to keep the other scripts cleaner and more readable. These 2 scripts
 are always included in my scripts, and are candidates for a module. I'm
 thinking of making them into 1 or 2 modules. The concept works for the first
 2 stages (I just need to write more code for the rest).
 
 I mainly have five questions and would appreciate advice on them:
 
 1) One or two modules?
 

[SNIPPED - no idea]

 2) Should it be a module at all?
 
 Since I heavily depend on a database back end, should it be a module of its
 own? I need to reuse code for many tasks (different scripts) in order to
 web scrape metadata from the site. Is it more of an App?
 

Well, do you want to write a .pm file (which is often a good idea) or do you
want to prepare a CPAN-like distribution? Maybe see:

http://www.slideshare.net/thaljef/cpan-for-private-code

(There are more similar links here - http://perl-begin.org/topics/cpan/ .)

 
 3) Namespace
 
 I'm not quite sure if I'm going to release all the code to scrape the site.
 I've put code in several scripts which may or may not be included alongside
 my module(s). There are 2 main reasons. First, it took me 4 days to scrape
 the site the first time, and I don't want everyone to scrape the whole site
 just for fun. Second, I'm not completely confident that everyone would
 respect my licence. I'm happy to share on a non-commercial basis, but would
 like something in return if the code is used commercially. If it's released
 as an app (working code for everyone), then the App:: namespace should be
 used, if I understood pause_namingmodules correctly. Otherwise, depending on
 whether it's one or two modules, I've been thinking of DIS::DA and
 DIS::DA::DBI (DIS is the acronym for the genealogy society I'm a member of
 and am writing code for; DA is a known acronym for Digitalarkivet, the
 Digital Archive of Norway). If it's one module, DigitalArkivet.pm might be
 the best choice?
 

DA could also mean District Attorney, DeviantArt (see
https://en.wikipedia.org/wiki/DeviantArt ), and lots of other things, so it's
better to be more explicit.

 
 4) Best practice for POD?
 
 As a newbie to POD, I've put the POD in between the code, reducing the
 need for (extra) header comments on subs. The POD documents each sub and
 acts as its header. Most code I've seen puts all the POD at the end of the
 file. (Both can be done, but is the latter highly recommended / best
 practice?) I find it easier to write POD when I can see what is going on,
 and it also forces me to write the POD at once. I could move everything to
 the end of the file before release, but then I feel I'd have to (re)write
 header comments for each sub.

The book Perl Best Practices by Damian Conway recommends putting all the POD
at the end, but there isn't a general consensus in the Perl community about
it. I, for one, am content with either way. Note that tools like
https://metacpan.org/release/Dist-Zilla and
https://metacpan.org/release/Pod-Weaver can help a lot with maintaining POD.
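
For reference, the interleaved style described in the question might look
roughly like the sketch below. The package name DIS::DA is taken from the
namespace question; the sub clean_title and its behaviour are hypothetical
and only there to have something to document. Each POD block sits directly
above the sub it describes, and =cut hands control back to the Perl parser.

```perl
use strict;
use warnings;

package DIS::DA;

=head1 NAME

DIS::DA - hypothetical helpers for harvesting Digitalarkivet metadata

=head1 FUNCTIONS

=head2 clean_title

    my $title = DIS::DA::clean_title($raw);

Trims leading and trailing whitespace and collapses internal runs of
whitespace into single spaces.

=cut

sub clean_title {
    my ($raw) = @_;
    $raw =~ s/^\s+|\s+$//g;    # trim both ends
    $raw =~ s/\s+/ /g;         # collapse internal whitespace
    return $raw;
}

# In a real .pm file the module would end with a true value (1;).

package main;

print DIS::DA::clean_title("  Folketelling   1910 \n"), "\n";
# prints "Folketelling 1910"
```

Moving to the end-of-file style would mean cutting the POD blocks and pasting
them after the code (conventionally after an __END__ line). perldoc renders
both layouts the same way, so the choice mostly affects whoever reads the
source.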

 
 5) To CPAN or not to - Licence
 
 My first thought is to licence it as something like this:
 
 DA-DBI.pl by Rolf B. Holte is licensed under a Creative Commons
 Attribution-NonCommercial-ShareAlike 4.0 International License. Permissions
 beyond the scope of this license may be available at
 http://dev.perl.org/licenses/artistic.html.
 
 Why? I'd like to share the code, but not for commercial use.
 

First of all, note that I am not a lawyer (IANAL) and this is not legal
advice (TINLA). That aside:

1. The Creative Commons organisation recommends against using its
CC-by/CC-by-sa/CC-by-nc/CC-by-nc-sa/CC-by-nc-nd/CC-by-nd licences for licensing
source code. So you may wish to use a different licence.

2. A free and open-source software (FOSS) licence (see
http://en.wikipedia.org/wiki/Free_and_open-source_software ) may not
prohibit commercial use. See:

* http://opensource.org/osd

* https://www.gnu.org/philosophy/selling.html

Note that copyleft licences (see https://en.wikipedia.org/wiki/Copyleft )
contain provisions against making some types of FOSS code proprietary, but
that does not equate to prohibiting all commercial use.

3. I recall reading that all the source code that is uploaded to CPAN should be
FOSS.

 Would that be OK, or do I have to use the Perl/Artistic license to put it on
 CPAN? Can I prohibit commercial use?

You can use any free and open-source software licence, but you should opt
for the Artistic License version 2.0 (see
https://duckduckgo.com/?q=artistic%202.0 ) rather than the original Artistic
License, which the FSF considers non-free (see
https://www.gnu.org/licenses/license-list.html ). You cannot prohibit
commercial use if you use Artistic 1.0/2.0. If you wish to do so, you should
use a different licence *and* consider not putting your code on CPAN.

Regards,

Shlomi Fish

--