[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-03-02 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-593282253
 
 
   Hello @vsoch - I spoke to co-organizer of Airflow Summit - @leahecole - and 
I think we can figure out something with regards you being speaker at the 
Summit. @leahecole -> can you please reach out to @vsoch and take it from here. 
I would love to make this happen!


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-23 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-590053213
 
 
   I see, pity, let's see. It's the first-time event so we do not know what to 
expect and for now do not  foresee reimbursements, but if we are successful 
with finding sponsors, who knows, maybe we will introduce a diversity fund. I 
will keep you updated :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-23 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-590051343
 
 
   BTW. @vsoch I am going to give a talk together with @aijamalnk on Diversity 
and building even more welcoming community at the Airflow Summit 
(http://airflowsummit.org). And I will certainly  use some of the things we've 
learned during this conversation. But I have just a thought - since you are 
nearby (I guess :) ) maybe you would like to give some talk about your side of 
the experience (and a bit about Singularity). I think that might be cool. 
   
   We have CFP open : 
https://pretalx.com/apache-airflow-summit-bay-area-2020/cfp :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-23 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-590050910
 
 
   Indeed Awesome work 🎉 🎉 🎉 🎉 🎉 🎉  . Likewise :). Thanks for cooperation on 
that one !


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-22 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-590033385
 
 
   one transient error - but I restarted it. Looks good !
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-22 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-590033396
 
 
   :crossed_fingers: 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-22 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-590026699
 
 
   It could be a transient error or some temporary problem with master (despite 
of the protections we have in place it breaks occassionally). We do not see it 
now in the latest master, so I suggest to rebase to the latest master (it's 
green: 
https://travis-ci.org/apache/airflow/builds/653784320?utm_medium=notification&utm_source=github_status)
 and see if the problem is still there.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-21 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-589639710
 
 
   > looks like the error linked above was one off - a later build didn't 
trigger it.
   
   Yeah. We still have some flaky tests - much less than before but they happen 
sometimes due to resource problems we have with Travis :(. We are close to 
migrate to Github Actions but it's one of the things that gets delayed as we 
are busy with other higher-priority stuff. It's annoying but it's not a blocker
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-19 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-588461323
 
 
   > One quick note about the docs for pylint - the files are now changed to 
have pylint_main.sh and pylint_tests.sh. Also, this unexpectedly started that 
huge build process that failed before. Given a .pylintrc file in some root and 
running pylint, it would be great to see a much simpler (not wrapped / 
abstracted) run of pylint on a single file.
   
   I'd really recommend (again) pre-commits. They will run pylint on all 
modified files only. This is far better than having to run it manually - the 
will simply run pylint (or whatever of the other tests) automatically on all 
the files you modified when committing it. You don't even have to remember 
about it. And the reason for the build process is to make sure that you run it 
in the same environment as Travis CI and other people. The problems with tools 
like pylint is that the results depend often on installed libraries, pylint 
version etc. For example you need to have installed stubs for python libraries 
in order to do full static analysis. So there is huge variability depending on 
your local environment and you can simply get different results locally than on 
Travis. That's why we run pylint tests inside the docker container which 
mirrors this container that runs on Travis. This is the only way to make sure 
everyone runs in the same environment.
   
   It happened many times that people reported "I see different errors than 
those on Travis" - and it turned out that they have different pylint or python 
version. This is the real benefit of the "huge build". you start to appreciate 
it when your team of contributors is totally distributed, maintain their own 
environment and consists of 600 individuals - each with their own setup.
   
   Simply - it looks differently from the point of view of single person and 
differently from the point of view of team of few hundred people. Sometimes 
efficiency of team require some sacrifice of efficiency of single individual. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-19 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-588435931
 
 
   > How would you like me to address this?
   > 
   > ```
   > 
PATH=/root:/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/hive/bin:/usr/local/go/bin:/opt/gcloud/bin
   > 
   > * Module airflow.providers.singularity.operators.singularity
   > 
   > airflow/providers/singularity/operators/singularity.py:65:4: R0913: Too 
many arguments (12/10) (too-many-arguments)
   > ```
   > 
   > I had added working_dir to kwargs, but then during the test there was an 
error that the argument wasn't found. So I added it, but now I'm over.
   
   The pylint defats we have are 'reasonable' i.e work in most cases but if you 
have a justified case and do not think refactoring / using a structure instead 
of n-parameters you could disable pylint check - this specific check for this 
statement only by adding appropriate # pylint: disable comment. This is a sign 
from the programmer that this is a deliberate decision and theat it is 
conscious choice to disable tis rule. You can find plenty of examples in the 
code and more detailed explanation including some guidance on the decision 
making in the 
https://github.com/apache/airflow/blob/master/STATIC_CODE_CHECKS.rst in pylint 
section


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-18 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-587354436
 
 
   Hey @vsoch a bit unrelated, but wanted you to know that we have just merged 
this update to CONTRIBUTING documentation to make ti easier for people like you 
to understand the dynamics of our project's communication:  
https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#how-to-communicate


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-15 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-586675640
 
 
   I think we do not have it explained very well - and I am not 100% sure 
myself. But @mik-laj can help with that.
   
   One more thing @mik-laj - maybe we should add a short explanation/howto on 
how to add the documentation - where things should be excluded, how to add new 
packages and guides? For now it's not at all obvious what steps should be taken 
in case a new provider's package is added. A short "how to add new package 
documentation" would be great in CONTRIB docs. It does not have to be long - 
just to explain what to do.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-15 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-586641854
 
 
   Sorry - devl...@airflow.apache.org - see CONTRIB.rst. I was typing it on my 
tablet and autocomplete kicked in.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-15 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-586640329
 
 
   > Conflict resolved - it was the same line as before, another package was 
added! I added a new line (it was getting too long) so hopefully that won't 
make the linter unhappy...
   
   yep. it is a very active project. so conflicts are expected. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-15 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-586633715
 
 
   and you need t rebase as well ontop of latest master & you have conflict 
with setup.py


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-15 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-586633655
 
 
   > Questions:
   > 
   > * where does example_singularity_operator.py go - it used to be under 
contrib/example_dags but now, that entire folder is empty?
   
   in the meantime we completed AIP-21 and every integration is now moved to 
'providers' package.Singularity will need to be moved to providers/singularity
   
   > * this error is triggering for the docs (is it related to the above)? I 
don't see any script_files with grep, nor do I see how super() relates to the 
example_singularity_operator.py file.
   
   just look where any of the other providers are added - there is a TOC where 
you can navigate to your documentation - otherwise users won't be able to find 
the docs.
   
   > 
   > ```
   > writing additional pages...  
search/usr/local/lib/python3.6/site-packages/sphinx_rtd_theme/search.html:20: 
RemovedInSphinx30Warning: To modify script_files in the theme is deprecated. 
Please insert a 

[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-14 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-586169342
 
 
   > High level feedback since this was a topic earlier:
   > 
   > * the extra linting for the Dockerfile, which comes down to fairly trivial 
things like using cd in a command for one layer and using curl instead of wget, 
has led to several additional errors that are frustrating to deal with. If I 
knew the project perfectly sure I might have known these in advance, but for a 
new contributor that gets these bugs (which are somewhat subjective) that then 
has to figure them out and fix, it's an extra pain that simply doesn't need to 
exist.
   > * given the long duration of a PR, the addition of new linters / tests 
makes it even more confusing.
   > 
   > I'm happy to keep providing this feedback, and if you don't want it let me 
know!
   Feedback is welcome :).
   
   I understand it's a bit of pain with linting, but I would treat it as a 
great learning experience. For me this is the best trainer I have - when I make 
a mistake or don't do something very well and can do better, I get feedback. 
That's the case with linting - I've learned a lot of good practices from 
linting - hadolint for Docker and especially shellcheck for bash. 
   
   And especially that those lint rules have very good explanation rationale 
and most of them has the effect that they prevent errors that others might have 
problems with. For example "using cd" has a good explanation and I think it's a 
fantastic practice to use WORKDIR instead of cd in Dockerfiles as it makes it 
more controllable - you always know which directory you are in without having 
to parse the shell statement. Here is more detailed explanation with example 
and rationale: https://github.com/hadolint/hadolint/wiki/DL3003 
   
   You might of course disagree with particular rules (and you have full right 
to do so) but we also have the rule in community "disagree but engage" - 
community decisions are more important than individual preferences. That's one 
of the things that I had to learn when I joined the community - there are a 
number of things I do not agree with but then after community decided otherwise 
i follow and even engage with). On the other hand there were a few rules that 
were annoying and we raised a proposal to remove them eventually, so there is 
still place for discussing them in the devlist - and if you have good reasoning 
why the rules should be removed, I am sure people who will be interested in it 
will have opinions and maybe decision will be made to disable those rules.
   
   * Regarding adding new checks - here again community is more important. When 
you look at it from your individual point of view, it's bad, but if you look at 
it from the community, there are more than 100 PRs opened at any time so we 
would never had a chance to improve - steady improvements are crucial, if you 
don't improve you actually deteriorate, so while annoying, it's also expected 
that after rebase you might have to adapt to some new rules. That's actually a 
sign of healthy project. Also if you had pre-commits installed and working, 
those "encoding" lines would be removed automatically for you. Whenever we add 
new rules we try to make them self-fixing and having pre-commit is one of the 
reasons for that.
   
   
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-13 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-585964299
 
 
   > 1. installing Singularity as a binary inside a container (these bases 
already exist)
   > 2. running the container via docker-compose
   > 3. But then running tests inside of that container
   > 
   > Is this possible?
   
   I see now !. In this case I think the easiest is to add the binary to Docker 
image and running it from there. Then there is no need to run it as separate 
image/runtime. That will be way simpler. We already  do that for the 
minicluster - for hadoop tests:
   
   Docker installation here: 
https://github.com/apache/airflow/blob/master/Dockerfile#L184
   And starting the cluster here: 
https://github.com/apache/airflow/blob/master/scripts/ci/in_container/entrypoint_ci.sh#L140
   J.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-13 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-585943248
 
 
   > okay just to clarify - you want a Singularity + Airflow container run via 
a similar kind cluster? You said something about adding specific commands 
`--start-singularity` and `--stop-singularity` and I'm not sure what / where 
that is referring to, and how / why we would want to start or stop singularity 
(it's not a service, it's just a binary installed).
   
   I assume we can start singularity from the host and be able to forward this 
connection to inside the airlfow-testing container so that we can connect to 
it. With Kind - we are starting it from within the container (by forwarded 
docker socket) , but it could be started from the host as well (mongo, kerberos 
and others are started from the host using `docker-compose` configuration and 
then we can connect to them from the "airflow-testing" by specifying their 
service names (mongo/kerberos etc).


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-13 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-585941481
 
 
   > okay so I'm tracing the kubernetes (runtime) as an example, and I have a 
quick question. In scripts/ci/in_container/entrypoint_ci.sh I see that given 
the kubernetes runtime, you are setting some variable for a container name to 
be the airflow name with suffix "-kubernetes."
   > 
   > and this would suggest the container is brought up inside of this 
container? Wouldn't it make more sense to have a docker-compose run that runs 
some equivalent of airflow-testing but with Singularity installed inside?
   
   Yes it would make sense - similar to what we do with other integrations (say 
kerberos or mongo). What we are doing with kind is a bit different - we are 
forwarding the host's docker credential to inside the airflow-testing container 
and we are running the kind command line from inside it. This - in effect - 
reaches out to the host's docker and set's up the kind cluster using host's 
docker engine.
   
   With singularity - if it can be run in the host and we can connect to it 
from inside the airflow-testing container that would be best.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-02-13 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-585939693
 
 
   Somehow  I missed it completely. Sorry. Responding now.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-01-28 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-579443300
 
 
   > Could you give me some pointers about where to write the Dockerfile, what 
example from the current codebase (link) is what I should try to copy the logic 
for?
   
   I think the closest one is Kubernetes and I am leaning towards creating 
another RUNTIME (singularity). There are few of scripts to update. But it's not 
that hard at the end.
   
   I think a custom `--start-singularity` `--stop-singularity` switches should 
be added similar to `--start-kind-cluster` `--stop-kind-cluster`.
   
   It's best if you look for RUNTIME capital wherever it's used but here is the 
guide
   
   Breeze (for local development and testing your solution):
   
   - https://github.com/apache/airflow/blob/master/breeze-complete#L40 -> 
options to add runtime are added here amongst others
   - https://github.com/apache/airflow/blob/master/breeze#L203 -> Displaying 
chosen RUNTIME
   - https://github.com/apache/airflow/blob/master/breeze#L595 -> Flags of 
Breeze
   - https://github.com/apache/airflow/blob/master/breeze#L1007 -> that's where 
RUNTIME leads to adding extra docker compose
   - https://github.com/apache/airflow/blob/master/breeze#L391 -> help 
describing the options.
   
   Docker Compose:
   - 
https://github.com/apache/airflow/blob/master/scripts/ci/docker-compose/runtime-kubernetes.yml
 - similar should be created for singularity and used in Breeze.
   
   In container:
   - 
https://github.com/apache/airflow/blob/master/scripts/ci/in_container/entrypoint_ci.sh#L164
 - entrypoint for the CI where Kubernetes is started after choosing the 
RUNTIME="kubernetes" here is where the setup for singularity can be done. It 
runs 
https://github.com/apache/airflow/blob/master/scripts/ci/in_container/kubernetes/setup_kind_cluster.sh
 under the hood. 
   
   That's it for interactive "local" breeze session. T
   
   Then if you run tests the script goes further. It builds airflow kubernetes 
image and deploys it to kubernetes. In interactive session, the deployment 
should be run manually as described in 
https://github.com/apache/airflow/blob/master/TESTING.rst#running-tests-with-kubernetes
 
   
   but if you run tests in travis (RUN_TESTS="true") those steps are executed 
manually.
   
https://github.com/apache/airflow/blob/master/scripts/ci/in_container/entrypoint_ci.sh#L218
 you have the deployment of image to kubernetes: 
https://github.com/apache/airflow/blob/master/scripts/ci/in_container/deploy_airflow_to_kubernetes.sh
 and one of the scripts it runs 
https://github.com/apache/airflow/blob/master/scripts/ci/in_container/kubernetes/docker/rebuild_airflow_image.sh
 - that's where the Kubernetes image is built.
   
   I hope that's enough leads.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-01-28 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-579315631
 
 
   Thanks @vsoch ! Really sorry for not getting back before - I've been 
travelling and missed that last comment.
   
   Fantastic work! I really liked how you explained the steps you did to make 
it works! Kudos!
   
   > So, although I'm hesitant that something about travis won't provide us 
with all the permissions that we need, I'm going to give it a try to add a 
docker compose file to reproduce this. My question for you is how I should go 
about providing this integration - the examples given previously expose 
services, and for this case we would want an environment with airflow and 
singularity in the same space. Do you have thoughts or examples for how to 
approach this?
   
   Travis and any other environment will provide all that we need. We already 
run in privileged mode (and pretty much all CI systems allow that in order to 
make all kinds of docker-in-docker approach). In fact what we already do is we 
are mounting docker socket to within the airflow-testing so that you can even 
run docker commands to reach out the host docker engine. This is needed to run 
kind (Kubernetes-In-Docker).

   > And then given that we can add a custom container, it would need to use 
one of the Singularity bases (note they are alpine), and then also include 
dependencies for airflow (or is the install already handled given the 
container, and happens before the testing?) Either way, if customization is 
needed on top of the Singularity base (that isn't handled by the testing suite) 
what would be the best way to do this - building the image elsewhere and using 
here, or building on demand?
   
   Hmm, I'd assume that having a base Singularity should be enough and that we 
can reach it via "singularity" host name via docker compose. But I understand 
you need to build in some Airflow dependencies in? Am I right? What kind of 
dependencies you need to build in that container? Is this like local airflow 
sources or some airflow dependencies? Excuse my ignorance as I do not know much 
about singularity. 
   We have few approaches we can get:
   1) In case of Kubernetes (Kind) we are using RUNTIME="kubernetes" env and we 
are checking that in the initial CI scripts - if the RUNTIME="kubernetes" we 
will install kubernetes-in-docker cluster and provision it with the latest 
Airflow locally built CI image and then we run our tests with this.  This is 
most complex thing we do and I hope we do not have to do the same.
   
   2) If those are about some small packages to install at running singularity 
after it is started already, we could simply do it in the CI scripts when we 
start up and see "INTEGRATION_SINGULARITY" is set to "true" (this will be 
properly set to true if you add it similarly to other integrations of ours and 
when --integration singularity switch is used).
   
   3) If you need something pre-backed in the Docker image itself then I think 
we need a separate Dockerfile with FROM: singularity and some steps to build 
the dockerfile before docker-compose is run (and refer to the newly built image 
in Docker compose. My best bet if that is the case, that would be that you 
build and test such images locally for local testing and once it works I can 
help to integrate it in the CI scripts properly (You can try yourself - this 
will likely be just appropriate 'docker build' command executed only when 
--integration singularity is used)
   
   
   I hope this is helpful!
   
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-01-21 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-576861320
 
 
   That's OK!


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-01-20 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-576477396
 
 
   > But one quick note - regardless of the testing troubles, the support on 
this PR has been absolutely excellent, night and day when compared to my 
previous experience. I'm really appreciative of that :)
   
   Thanks for the kind words :). The trouble with that it that we can't afford 
to spend that much time  for every single new person coming :(. The project 
needs to be scalable. That's why my goal is to have it all as smooth as 
possible and as easy to self-study and self-discover as possible - only 
reaching out to mentors/committers/other experienced contributors when really 
needed. 
   Of course you can't do it from day 1 or even day 5, but my hope is that if 
we relentlessly work on it and improve it and also get a lot of feedback from 
people like you - we can approach this "ideal" where everything is 
self-explanatory and easy to follow. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-01-20 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-576475634
 
 
   > If it makes you feel better, just chock it up to me being an idiot and I 
promise I’ll never bother this community again.
   
   First of all it's not a matter of making me feel better at all. I certainly 
don't think of you as an idiot, and I would never, ever call anyone idiot, just 
because that person has no time, or bandwidth or capacity, to read and 
understand the TON of documentation we have. And if anything - that's a sign 
that maybe we should do better. You seem to be competent and you know how to 
program and fix things on your own and do your best to do it - I certainly 
cannot blame you that you do not know our complex setup.
   
   And Airflow IS a complex project. It has a LOT of moving parts and my goal 
is to make it approachable to even one-time contributors and my ultimate goal 
is to make it as easy and painless as possible - while keeping the high bar on 
quality, test automation, check automation etc. 
   
   I am just curious what are the obstacles to that - and whether we could make 
it somewhat easier to someone like you who just comes to the project from 
outside for a short time. That's why I am looking for some suggestions/areas of 
improvements. I know the project too well myself to see the obstacles - there 
is even a term for that "Expert Blindness" - http://mnav.com/expert-blindness/ 
that's why I am looking for help and suggestions of people like you.
   
   To have the project much easier approachable is my goal from the very 
beginning of the project and - believe me - a year ago it took literally days 
to set it all up so that you could run tests. Now (if we do not have sudden 
breakage of master by 3rd party dependencies) you can have it up and running in 
10 minutes (with fast network).
   
   I think so far - by looking of what you've done, I will make a few 
improvements:
   
   - I am going to make "Breeze" THE default dev environment (we have it much 
lighter as of a few weeks) and if you had it installed (by simply running 
./breeze command), then pre-commit would just work out-of-the box
   
   or (and?)
   
   - we can direct new contributors to simply follow the CI messages - they are 
descriptive enough and should be good to start
   
   I already added a few improvements to Contributing and we even have this PR 
in progress to explain how communication is expected to look like : 
https://github.com/apache/airflow/pull/7204 - I will be happy if you add your 
comments there.
   
   WDYT? Anything else you think we can improve to make it more approachable?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-01-20 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-576470784
 
 
   > The linting appears to be all set, but since Singularity is installed, the 
tests are obviously going to fail. Let me know how you would like to proceed - 
it's a non trivial thing to get Singularity installed from within Travis.
   
   I don't think we have enough singularity experience to add it. We approached 
in various ways for other integrations:
   - some of the integrations (mongo etc.) are installed as docker-compose 
images (for example here: 
https://github.com/apache/airflow/blob/master/scripts/ci/docker-compose/integration-mongo.yml)
 
   - some of them (Kubernetes) are installed via kind (Kubernetes-in-Docker) - 
here: 
https://github.com/apache/airflow/blob/master/scripts/ci/in_container/kubernetes/setup_kind_cluster.sh
   - some of them we install inside the docker container via regular Dockerfile 
image build instructions  (minicluster/hive/hadoop) - 
https://github.com/apache/airflow/blob/3730c24c41470cd331c5109539ee2fa0c9f4e74a/Dockerfile#L152
 https://github.com/apache/airflow/blob/master/Dockerfile#L185
   
   I do not know if/whether any of such options is possible for Singularity - 
please take a look and see  if this is possible to use either of the options 
above?
   
   If we find integration too difficult - In a number cases we rely on "unit 
tests" where we mock the external systems and mark the tests as skipped - we 
have a custom marker `pytest.mark.integration` that does it: 
https://github.com/apache/airflow/blob/master/TESTING.rst#running-integration-tests
 . If you have `pytest.mark.integration("singularity")` as decorator, the test 
will be skipped unless you have INTEGRATION_SINGULARITY env variable defined. 
You can leave such tests in the code and manually test it when you have 
singularity (but it's best if we can figure out a way to add it to the 
docker-compose we are already running).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add singularity to airflow

2020-01-20 Thread GitBox
potiuk commented on issue #7191: [AIRFLOW-4030] second attempt to add 
singularity to airflow
URL: https://github.com/apache/airflow/pull/7191#issuecomment-576467652
 
 
   > Also, could you give feedback on:
   > 
   > > Title Validator — Wrong commit title: adding singularity operator and 
tests
   > 
   > What is wrong about it?
   
   I think it's quite clearly described - it's about the "commit title". As 
mentioned in the PR description:
   
   > Commit message/PR title starts with [AIRFLOW-]. AIRFLOW- = JIRA ID*
   
   Commit message should start with [AIRFLOW-] similarly as PR description. 
You can run `git commit --amend` to change it and then `git push --force` to 
push it. It's also quite clearly described in the "Pull Request guidelines": 
https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines
 with example:
   
   > Preface your commit's subject & PR title with [AIRFLOW-XXX] COMMIT_MSG 
where XXX is the JIRA number. For example: [AIRFLOW-5574] Fix Google Analytics 
script loading. 
   
   I have a question @vsoch  - it seems that for some reason it is quite 
difficult to follow (for you at least). We have not only good and comprehensive 
documentation but also we have spend quite a lot of time to have messages 
printed either by the pre-commit system, or docs and require additional 
explanation (mostly repeating what is in the documentation already) - and 
pretty much careful reading the error messages could (I think) prevent asking 
extra questions - that was at least the goal. The messages are designed to 
guide you and In my answers I mostly repeat the messages already written there 
when you ask the question and link to already existing documentation. 
   
   I am really curious why it happens. Maybe we can improve it somehow - is the 
documentation not written clearly enough? Is there any problem with surfacing 
the error messages ? Or maybe there is a problem with finding relevant 
information? Or maybe you need more time to get familiar with the tools we are 
using ? 
   
   We tried (with professional technical writer) to place the documentation 
exactly where it is supposed to be, plus we push it exactly at the place where 
problems happen (for example in pre-commit checks) and try to make it as clear 
as possible - but apparently you still have problems with finding it. I am 
really curious - would it be possible for you to explain how are you 
approaching it, how are you reading (or maybe not?) the messages you get and 
how you are understanding (or not) what we try to communicate. 
   
   As a novice user you are the best "test" for it - maybe also you can have 
some ideas, proposals or maybe even a Pull Request that can improve it and make 
the documentation/messages better?
   
   Looking forward to more insight.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services