another to-do when pulling from many repositories is to balance the
languages included, so there is the same number of files for each

i know there is a way of handling an unbalanced set when that issue is
present, but i do not know what it is, and it seems to me that balancing
the set the model is built from would be the most robust approach
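
a rough sketch of what i mean by balancing, in python. this is only an illustration, not what the project does: the extension-to-language table and the function name balance_by_language are made up for the example, and it simply downsamples every language to the size of the rarest one

```python
import random
from collections import defaultdict
from pathlib import PurePath

# hypothetical mapping from file extension to language label;
# a real one would need to cover whatever the repositories contain
EXT_TO_LANG = {
    ".c": "c",
    ".h": "c",
    ".cpp": "cpp",
    ".py": "python",
    ".js": "javascript",
}


def balance_by_language(paths, seed=0):
    """Downsample so every language keeps as many files as the rarest one."""
    by_lang = defaultdict(list)
    for p in paths:
        lang = EXT_TO_LANG.get(PurePath(p).suffix.lower())
        if lang is not None:  # files with unknown extensions are skipped
            by_lang[lang].append(p)
    if not by_lang:
        return {}
    # size of the smallest language group sets the quota for all groups
    quota = min(len(files) for files in by_lang.values())
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    return {
        lang: sorted(rng.sample(files, quota))
        for lang, files in by_lang.items()
    }


# example: 4 c-family files, 2 python files, 3 javascript files
paths = ["a.c", "b.c", "c.c", "d.h", "x.py", "y.py", "m.js", "n.js", "o.js"]
balanced = balance_by_language(paths)
```

the downside is that this throws data away, since the rarest language decides how much of every other language is kept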

currently the data is pretty c-family heavy. when i tried to use it
for a python file, it ignored the .py ending and generated c code
